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We present a novel certified and complete algorithm to compute arrangements of real planar 
algebraic curves. It provides a geometric-topological analysis of the decomposition of the plane 
induced by a finite number of algebraic curves in terms of a cylindrical algebraic decomposition. 
From a high-level perspective, the overall method splits into two main subroutines, namely an 
algorithm denoted BlSOLVE to isolate the real solutions of a zero-dimensional bivariate system, 
and an algorithm denoted GeoTop to analyze a single algebraic curve. 

Compared to existing approaches based on elimination techniques, we considerably improve 
the corresponding lifting steps in both subroutines. As a result, generic position of the input 
system is never assumed, and thus our algorithm never demands for any change of coordinates. 
In addition, we significantly limit the types of involved exact operations, that is, we only use 
resultant and gcd computations as purely symbolic operations. The latter results are achieved by 
combining techniques from different fields such as (modular) symbolic computation, numerical 
analysis and algebraic geometry. 

We have implemented our algorithms as prototypical contributions to the C++-project 
Cgal. They exploit graphics hardware to expedite the symbolic computations. We have also 
compared our implementation with the current reference implementations, that is, LGP and 
Maple's ISOLATE for polynomial system solving, and CGAL's bivariate algebraic kernel for 
analyses and arrangement computations of algebraic curves. For various series of challenging 
instances, our exhaustive experiments show that the new implementations outperform the 
existing ones. 

Keywords: algebraic curves, arrangement, polynomial systems, numerical solver, hybrid 
methods, symbolic-numeric algorithms, exact computation 



1. Introduction 

Computing the topology of a planar algebraic curve 



can be considered as one of the fundamental problems in real algebraic geometry with numerous 
applications in computational geometry, computer graphics and computer aided geometric 
design. Typically, the topology of C is given in terms of a planar graph Qc embedded in IR 2 
that is isotopic to For a geometric-topological analysis, we further require the vertices of 
Qc to be located on C. In this paper, we study the more general problem of computing an 
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arrangement of a finite set of algebraic curves, that is, the decomposition of the plane into 
cells of dimensions 0, 1 and 2 induced by the given curves. The proposed algorithm is certified 
and complete, and the overall arrangement computation is exclusively carried out in the initial 
coordinate system. Efficiency of our approach is shown by implementing our algorithm based 
on the current reference implementation within CGAlj^] (see also [TJ |2]) and comparing it to the 
most efficient implementations which are currently available. 

From a high-level perspective, we follow the same approach as in [H [2]. That is, the ar- 
rangement computation is reduced to the geometric-topological analysis of single curves and of 
pairs of curves. The main contribution of this paper is to provide novel solutions for the basic 
subtasks needed by these analysis, that is, isolating the real solutions of a bivariate polynomial 
system (BlSOLVE) and computing the topology of a single algebraic curve (GeoTop). 

BlSOLVE: For a given zero- dimensional polynomial system f(x,y) = g(x,y) = (i.e. there 
exist only finitely many solutions), with f,gE Z[x,y], the algorithm computes disjoint boxes 
Bi, . . . , B m C M 2 for all real solutions, where each box Bi contains exactly one solution (i.e. B ri 
is isolating). In addition, the boxes can be refined to an arbitrary small size. BlSOLVE is a 
classical elimination method which follows the same basic idea as the GRID method from [3J 
for solving a bivariate polynomial system, or the INSULATE method from [3] for computing the 
topology of a planar algebraic curvej^] Namely, all of them consider several projection directions 
to derive a set of candidates of possible solutions and eventually identify those candidates which 
are actually solutions. 

More precisely, we separately eliminate the variables x and y by means of resultant compu- 
tations. Then, for each possible candidate (represented as a pair of projected solutions in x- 
and y-direction), we check whether it actually constitutes a solution of the given system or not. 
The proposed method comes with a number of improvements compared to the aforementioned 
approaches and also to other existing elimination techniques [H El El E] • First, we considerably 
reduce the amount of purely symbolic computations, namely, our method only demands for 
resultant computation of bivariate polynomials and gcd computation of univariate polynomials. 
Second, our implementation profits from a novel approach j9j [101 [TT] to compute resultants and 
gcds exploiting the power of Graphics Processing Units (GPUs). Here, it is important to remark 
that, in comparison to the classical resultant computation on the CPU, the GPU implementation 
is typically more than 100-times faster. Our experiments show that, for the huge variety of 
considered instances, the symbolic computations are no longer a "global" bottleneck of an elimina- 
tion approach. Third, the proposed method never uses any kind of a coordinate transformation, 
even for non- generic input The latter fact is due to a novel inclusion predicate which combines 
information from the resultant computation and a homotopy argument to prove that a certain 
candidate box is isolating for a solution. Since we never apply any change of coordinates, our 
method particularly profits in the case where / and g are sparse, or where we are only interested 
in solutions within a given "local" box. Finally, we integrated a series of additional filtering 
techniques which allow us to considerably speed up the computation for the majority of instances. 

GeoTop: There exist a number of certified and complete approaches to determine the 
topology of an algebraic curve; we refer the reader to [121 [131 IT31 [151 [16] for recent work and 
further references. At present, only the method from |13| has been extended to arrangement 



Computational Geometry Algorithms Library, |www . cgalT org ; see also http://exac us.mpi.-inf. mpg . de/ 
cgi-bin/xalci ■ cgi| for an online demo on arrangement computation. 

J For the analysis of a planar curve C = {(x,y) £ M. 2 : f(x,y) — 0}, it is crucial to find the solutions of 
/ = f v = 0. The method in jl] uses several projection directions to find these solutions. 

The system / = g = is non-generic if there exist two solutions sharing a common coordinate. 
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computations of arbitrary algebraic curves [T]. Common to all of these approaches is that, in 
essence, they consider the following three phases: 

1. Projection: Elimination techniques (e.g. resultants) are used to project the x-critical points 
(i.e. points p on the (complex) curve C = {(x,y) € C 2 : f(x,y) = 0} with f y (p) = 0) of 
the curve into one dimension. The so obtained projections are called x-critical values. 

2. Lifting: For all real x-critical values a (as well as for real values in between), we compute 
the fiber, that is, all intersection points of C with a corresponding vertical line x = a. 

3. Connection (in the analysis of a single curve): The so obtained points are connected by 
straight line edges in an appropriate manner. 

In general, the lifting step at an x-critical value a has turned out to be the most time- 
consuming part because it amounts to determining the real roots of a non square-free univariate 
polynomial f a (y) := f(a,y) G K[y] with algebraic coefficients. In all existing approaches, the 
high computational cost for computing the roots of f a is mainly due to a more comprehensive 
algebraic machinery such as the computation of subresultants (in [B EH E] ) , Grobner basis or 
a rational univariate representation (in [12]) in order to obtain additional information on the 
number of distinct real (or complex) roots of / a , or the multiplicities of the multiple roots of f a . 
In addition, all except the method from [12] consider a shearing of the curve which guarantees 
that the sheared curve has no two x-critical points sharing the same x-coordinate. This, in turn, 
simplifies the lifting as well as the connection step but for the price of giving up sparseness 
of the initial input. It turns out that considering such an initial coordinate transformation 
typically yields larger bitsizes of the coefficients and considerably increased running times; see 
also [16j for extensive experiments. 

For GeoTop, we achieved several improvements in the lifting step. Namely, as in the 
algorithm BlSOLVE, we managed to reduce the amount of purely symbolic computations, that 
is, we only use resultants and gcds, where both computations are outsourced again to graphics 
hardware. Furthermore, based on a result from Teissier [T71 [18] which relates the intersection 
multiplicities of the curves /, f x and f y , and the multiplicity of a root of f a , we derive additional 
information about the number n a of distinct complex roots of f a . In fact, we compute an 
upper bound n+ which matches n a except in the case where the curve C is in a very special 
geometric location. In the lifting phase, we then combine the information about the number 
of distinct roots of f a with a certified numerical complex root solver [19] to isolate the roots 
of f a . The latter symbolic- numeric step applies as an efficient filter denoted LlFT-NT that is 
effective in almost all cases. In case of a rare failure (due to a special geometric configuration), 
we fall back to a complete method LlFT-BS which is based on BlSOLVE. In addition, we also 
provide a simple test based on a single modular computation only to detect (in advance) special 
configurations, where LlFT-NT may fail. Considering a generic coordinate transformation, it 
can be further proven that LlFT-NT generally succeeds. We remark that the latter result is 
more of theoretical interest since our experiments hint to the fact that combining LlFT-NT and 
LlFT-BS typically yields better running times than Lift-NT on its own using an additional 
shearing. 

Experiments. We implemented GeoTop in a topic branch of Cgal. Our implementation 
uses the combinatorial framework of the existing bivariate algebraic kernel (Ak_2 for short) 
which is based on the algorithms from [H [13] . Intensive benchmarks [T3| [T6] have shown that 
Ak_2 can be considered as the current reference implementation. In our experiments, we run 
Ak_2 against our new implementation on numerous challenging benchmark instances; we also 
outsourced all resultant and gcd computations within Ak_2 to the GPU which allows a better 
comparison of both implementations. Our experiments show that GeoTop outperforms Ak_2 
for all instances. More precisely, our method is, on average, twice as fast for easy instances such 
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as non-singular curves in generic position, whereas, for hard instances, we typically improve by 
large factors between 5 and 50. The latter is mainly due to the new symbolic-numeric filter 
LlFT-NT, the exclusive use of resultant and gcd computations as the only symbolic operations, 
and the abdication of shearing. Computing arrangements mainly benefit from the improved 
curve-analyses, the improved bivariate solver (see below), and from avoiding subresultants and 
coordinate transformations for harder instances. 

We also compared the bivariate solver BlSOLVE with two currently state-of-the-art imple- 
mentations, that is, Isolate (based on Rs by Fabrice Rouillier with ideas from [7]) and Lgp 
by Xiao-Shan Gao et al. [20], both interfaced in Maple 14. Again, our experiments show that 
our method is efficient as it outperforms both contestants for most instances. More precisely, it 
is comparable for all considered instances and typically between 5 and 10-times faster. 

From our experiments, we conclude that the considerable gain in performance of BlSOLVE 
and GeoTop is due to the following reasons: Since our algorithms only use resultant and 
gcd computations as purely symbolic operations they beat by design other approaches that 
use more involved algebraic techniques. As both symbolic computations are outsourced to the 
GPU, we even see tremendously reduced cost, eliminating a (previously) typical bottleneck. 
Moreover, our filters apply to many input systems and, thus, allow a more adaptive treatment of 
algebraic curves. Our initial decision to avoid any coordinate transformation has turned out to 
be favorable, in particular, for sparse input and for computing arrangements. In summary, from 
our experiments, we conclude that instances which have so far been considered to be difficult, 
such as singular curves or curves in non-generic position, can be handled at least as fast as 
seemingly easy instances such as randomly chosen, non-singular curves of the same input size. 

We would like to remark that preliminary versions of this work have already been presented 
at ALENEX 2011 [21j and SNC 2011 [22j. A recent result |23] on the complexity of BlSOLVE 
further shows that it is also very efficient in theory, that is, the bound on its worst case bit 
complexity is by several magnitudes lower than the best bound known so far for this problem. 
In comparison to the above mentioned conference papers, this journal version comes along 
with a series of improvements: First, we consider a new filter for BlSOLVE which is based on a 
certified numerical complex root solver. It allows us to certify a box to be isolating for a solution 
(a, P) 6 I 2 in a generic situation, where no further solution with the same x-coordinate exists. 
Second, the test within GeoTop to decide in advance whether LlFT-NT applies, and the proof 
that LlFT-NT applies to any curve in a generic position have not been presented before. The 
latter two results yield a novel complete and certified method TOP-NT (i.e. GeoTop with 
LlFT-NT only, where LlFT-BS is disabled) to compute the topology of an algebraic curve. 



Outline. The bivariate solver BlSOLVE is discussed in Section [2} In Section |3j we introduce 
GeoTop to analyze a single algebraic curve. The latter section particularly features two 
parts, that is, the presentation of a complete method LlFT-BS in Section 3.2.1 that is based 



on BlSOLVE, and the presentation of the symbolic-numeric method LlFT-NT in Section 3.2.2 



LlFT-NT uses a numerical solver whose details are given in Appendix A. BlSOLVE and GeoTop 
are finally utilized in Section [4] in order to enable the computation of arrangements of algebraic 
curves. The presented algorithms allow speedups, among other things, due to the use of graphics 
hardware for symbolic operations as described in Section [5j Our algorithms are prototypically 
implemented in the CGAL project. Section [6] gives necessary details and also features many 
experiments that show the performance of the new approach. We conclude in Section [7] and 
outline further directions of research. 
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2. Bisolve: Solving a Bivariate System 

The input of our algorithm is the following polynomial system 



f{x,y) 



X] fijz'y* = and 9(x,y) = 



0. 



(2.1) 



i,j£N:i+j<m 



i,j£N:i+j<n 



where /, g G Z[x, y] are polynomials of total degrees m and n, respectively. It is assumed that 
/ and g have no common factors; otherwise, / and g have to be decomposed into common and 
non-common factors first, and then the finite-dimensional solution set has to be merged with 
the one- dimensional part defined by the common factor (not part of our algorithm). Hence, the 
set Vc := {(x,y) G C 2 \f(x, y) = g(x,y) = 0} of (complex) solutions of (2.1) is zero-dimensional 
and consists, by Bezout's theorem, of at most m ■ n distinct elements. 

Our algorithm outputs disjoint boxes C R 2 such that the union of all B^ contains all real 
solutions 

V w := {(x,y) G R 2 \f(x,y) = g(x,y) = 0} = V c nf 2 



of (2.1) and each Bk is isolating, that is, it contains exactly one solution. 



Notation. We also write 

Tfix m v n x riy 

f(x,y) = J2ft\y)x l = and g(x,y) = £ = ^^(x)^, 

i=0 i=0 i=0 i=0 

where , g^ G Z[x], f- x \ g^ G Z[y] and m x , n x and m y , n y denote the degrees of / and g 
considered as polynomials in x and y, respectively. For an interval / = (a, b) C R, rrij := (a+b)/2 
denotes the center and rj := (b — a)/2 the radius of /. For an arbitrary m G C and r G R + , 
A r (m) denotes the disc with center m and radius r. 

Resultants. Our method is based on well known elimination techniques. We consider the 
projections 



V, 



(x) 

(y) 



{x G C\3y G C with f(x, y) = g(x, y) = 0}, 
{y G C\3x G C with f(x, y) = g(x, y) = 0} 



of all complex solutions Vc onto the x- and y-coordinate. Resultant computation is a well studied 
tool to obtain an algebraic description of these projection sets, that is, polynomials whose roots 
are exactly the projections of the solution set Vc- The resultant w- v ' = ies(f,g;y) G Z[x] of / 
and g with respect to the variable y is the determinant of the (m y + n y ) x [m y + n y ) Sylvester 
matrix : 



S (y) (f,g):- 



Ay) Ay) 

J m v Jm.y-1 



Jo 



(y) 



o n f iy) f vy> 

U . . . U Jm y J m i 

(y) Ay) Ay) 



9>i 







9n y -l 



c(y) 

my- 

9 ( o V 



u g Uy 



9n 



(y) 



Ay) 
Jo 



From the definition, it follows that RS y '(x) is a polynomial in x of degree less than or equal 
to m • n. The resultant RS X ) = res(f,g;x) G Z[y] of / and g with respect to x is defined in 
completely analogous manner by considering / and g as polynomials in x instead of y. As 
mentioned above, the resultant polynomials have the following important property (see [23] for 
a proof): 
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Theorem 1. The roots of RS y '(x) are exactly the projections of the solutions of (2.1) onto the 



x-coordinate and the roots of the greatest common divisor h^ y \x) := gcd(f my (x),g ny (x)) of the 
leading coefficients of f and g. More precisely, 



{x G C\R {y) (x) = 0} = V^ x) U {x G C\h {y \x) = 0} 



For R( x \y), a corresponding result holds: 



{y G C\R^(y) = 0} = U {y G C\h^(y) = 0}, 

where h^ x \y) := gcd( f mx (y), g nx (y)). The multiplicity of a root a of (R^ ) is the surr^ of 
the intersection multiplicities of all solutions of (2.1) with x-coordinate (y- coordinate) a. 



Overview of the Algorithm. We start with the following high level description of the proposed 
algorithm which decomposes into three subroutines: In the first phase (BlPROJECT, see 



Section 2.1), we project the complex solutions Vc of (|2.l[) onto the x- and onto the y-axis. 
More precisely, we compute the restrictions vJf^ 



and Vv 



(m) 



V}. y) n R of the 



complex projection sets and vS y ^ to the real axes and isolating intervals for their elements. 



Obviously, the real solutions Vr are contained in the cross product C := V^ x) x V^ y> CR 2 . In 
the second phase (SEPARATE, see Section 2.2), we compute isolating discs which "well separate" 
the projected solutions from each other. The latter step prepares the third phase (VALIDATE, 



r(v) 



see Section 2.3) in which candidates of C are either discarded or certified to be a solution of 



(2.1). Our main theoretical contribution is the introduction of a novel predicate to ensure that a 
certain candidate (a, ff) G C fl Vr actually fulfills f(a,/3) = g(a, f3) = (cf. Theorem |4j) . F or 
candidates (a, /3) G C\Vr, interval arithmetic suffices to exclude (a, (3) as a solution of ( |2.1[ ). 

We remark that, in order to increase the efficiency of our implementation, we also introduce 
additional filtering techniques to eliminate many of the candidates in C. However, for the sake 
of clarity, we refrain from integrating our filtering techniques into the following description of 



the three subroutines. Section 5.1 briefly discusses a highly parallel algorithm on the graphics 



hardware to accelerate computations of the resultants the gcds needed in the first step, while 



the filtering techniques for VALIDATE are covered in Section 5.2 



2.1. BlPROJECT 

We compute the resultant R := R^ = res(f,g;y) G Z,[x] and a square-free factorization 
of R. More precisely, we determine square- free and pairwise coprime factors n G Z[x], i = 
1, . . . , deg(-R), such that R(x) = H^=i ( r i( x )Y- We remark that, for some i G {1, . . . , deg(i?)}, 
ri(x) = 1. Yun's algorithm [25l Alg. 14.21] constructs such a square-free factorization by essen- 
tially computing greatest common divisors of R and its higher derivatives in an iterative way. 
Next, we isolate the real roots ojjj, j — 1, . . . , £i, of the polynomials r^. That is, we determine 
disjoint isolating intervals I(ceij) C R such that each interval I(ati j) contains exactly one root 
(namely, ctjj) of and the union of all J(ofij), j — 1, . . . ,£i, covers all real roots of r^. For the 
real root isolation, we consider the Descartes method |26l [27] as a suited algorithm. From the 
square-free factorization we know that atij,j = 1, . . . , £{, is a root of R with multiplicity i. 



5 For a root a of h^ v \x) (or h^(y)), the intersection multiplicity of / and g at the "infinite point" (a, oo) (or 
(oo, a)) has also been taken into account. For simplicity, we decided not to consider the more general projective 
setting. 

6 The multiplicity of a solution (xo>2/o) of (2.1 1 is defined as the dimension of the localization of C[x,y]/(f, g) 
at (xo,yo) considered as C-vector space (cf. [24, p. 148]) 
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2.2. Separate 

We separate the real roots of R = from all other (complex) roots of R, an operation 
which is crucial for the final validation. More precisely, let a = aj 0J0 be the jo-th real root of the 
polynomial rj , where iq G {1, . . . , deg(-R)} and jo £ {1> • • • ,&i } are arbitrary indices. We refine 
the corresponding isolating interval I = (a,b) := 1(a) such that the disc Ag ri (mi) does not 
contain any root of R except a. For the refinement of /, we use quadratic interval refinement 
(QIR for short) |28| 129] which constitutes a highly efficient method because of its simple tests 
and the fact that it eventually achieves quadratic convergence. 

In order to test whether the disc A 8r7 (mj) isolates a from all other roots of R, we consider 
an approach which was also used in [30] . It is based on the following test: 



V;>,.r): pirn) 



k>l 



k\ 



r k > 0, 



where p G M[x] denotes an arbitrary polynomial and m, r, K arbitrary real values. Then, the 
following theorem holds Q 

Theorem 2. Consider a disk A = A m (r) C C with center m and radius r. 

1. If T^(m, r) holds for some K > 1, then the closure A of A contains no root of p. 

2. If T^-(m, r) holds for a K > y/2, then A contains at most one root of p. 
Proof. (1) follows from a straight-forward computation: For each z G A, we have 

p(z) = p(m + (z — m)) = p(m) + — — — (z — m) k , 



k>l 



and thus 



P( z )\ \ 1 1 V- \p {k) {m)\, k ( 1 



>i--±-.y 

nl m i < ^ 



z — ml > 



\p(m)\ \p(m)\ ^ k\ " V K 



since \z — m\ < r and X^(m,r) holds. In particular, for K > 1, the above inequality implies 
\p(z)\ > and, thus, p has no root in A. 

It remains to show (2): If T^-(m, r) holds, then, for any point z G A, the derivative p'(z) 
differs from p'(m) by a complex number of absolute value less than \p'(m)\/K. Consider the 
triangle spanned by the points 0, p'(m) and p'(z), and let a and (3 denote the angles at the 
points and p'(z), respectively. From the Sine Theorem, it follows that 

i ■ i \ 1 1 \ a m l sin 7l 1 
\^a\ = \p(m)-p(z)\. ¥m <-. 

Thus, the arguments of p'[m) and p'(z) differ by less than arcsin(l/i^) which is smaller than or 
equal to 7r/4 for K > y/2. Assume that there exist two roots a, b G A of p. Since a = b implies 
p'(a) = 0, which is not possible as Tf (m, r) holds, we can assume that a ^ b. We split p into its 
real and imaginary part, that is, we consider p(x + iy) = u(x, y) + iv(x, y) where u, v : 1R 2 R 
are two bivariate polynomials. Then, p(a) = p(b) = and so u(a) = v(a) = u(b) = v(b) = 0. 
But u{a) = u(b) = implies, due to the Mean Value Theorem in several real variables, that 
there exists a <fi G [a, b] such that 

V«(0) ± (b-a). 

7 For a similar result, the reader may also consider |31| . where a corresponding test based on interval arithmetic 
only has been introduced. 
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Similarly, v(a) — v(b) =0 implies that there exists a £ G [a, b] such that Vt> (£) _L (6 — a). But 
Vi>(£) = {v x (£),Vy(£)) = (—u y (£),u x (£)), thus, it follows that Vtt(£) || (b — a). Therefore, Vm(/0) 
and Vm(£) must be perpendicular. Since p' = u x + 7t> x = u x — iu y , the arguments of p'(ip) and 
p'(0 mus t differ by tt/2. This contradicts our above result that both differ from the argument 
of p'(m) by less than tt/A, thus, (2) follows. □ 

Theorem [2] now directly applies to the above scenario, where p = Ti and r = 877. More 
precisely, I is refined until (m/,877) and Tj^mj, 8rj) holds for all z 7^ io- If the latter 

two conditions are fulfilled, Ag rj (mi) isolates a from all other roots of R. In this situation, we 
obtain a lower bound L(a) for on the boundary of A(a) := A 2rj (mi-): 

Lemma 1. Lei / be an interval which contains a root a ofri . IfT^^ (mi, 877) and (mi, 877) 
holds for all i 7^ i , then the disc A(a) = A 2r/ (m/) isolates a from all other (complex) roots of 
R and, for any z on the boundary dA(a) of A(a), /ioWs that 

\R(z)\ > L(a) := 2- i °- deg ^| J R(m / - 2r 7 )|. 

Proof. A (a) is isolating as already A 8r[ (mi) is isolating. Then, let /3 7^ a be an arbitrary root 
of i? and d := \(3 — mi\ > 877 the distance between /3 and mi. Then, for any point z G dA(a), 
it holds that 

\z — B\ d—2ri At 1 1 , \z — a\ ri 1 

> -; — - — = 1 ; — — > - and - — - — 7 r > - — > 



\(mi-2n)-8\ d + 2n d + 2ri 2 \(mi - 2rj) - a\ 3r 7 4' 

Hence, it follows that 

\ R ( Z )\ > ( \ Z ~ a \ V° TT l^-/ 3 ! ^ 4 -io 2 -deg(fl) + i 

1^-2^)1 ' ^|( m/ _2r / )-a|y )=q - 2r r ) - /?| ' " 

where each root /3 occurs as many times in the product as its multiplicity as a root of R. □ 

We compute L(a) = 2~ l °~ dcg ^ \R(mi — 2rj)\ and store the interval 1(a), the disc A (a), and 
the lower bound L(a) for |-R(^)| on the boundary dA(a) of A(a). 

Proceeding in exactly the same manner for each real root a of R( y \ we get an isolating 
interval I (at), an isolating disc A (a) = A 2r/ (mj), and a lower bound L(a) for \R^\ on 9A(a). 
For the resultant polynomial R^ = res(f,g;x), BiProject and Separate are processed in 
exactly the same manner: We compute R^ and a corresponding square-free factorization. Then, 
for each real root f3 of R^ x \ we compute a corresponding isolating interval I(/3), a disc A(f3) 
and a lower bound L((3) for \R^\ on dA(/3). 



2.3. Validate 

We start with the following theorem: 

Theorem 3. Let a and (3 be arbitrary real roots of R^ and R^ x \ respectively. Then, 



1. the polydisc A(a,/3) := A (a) x A(/3) C C 2 contains at most one solution of (2.1). If 



A(a,/3) contains a solution of (2.1), then this solution is real valued and equals (a, (3). 
2. For an arbitrary point (z\,z 2 ) G C 2 on the boundary of A(a, 0), it holds that 
I^^OI > L ( a ) *M G dA(a), and \R {x) (z 2 )\ > L((3) if z 2 G dA((3). 
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Proof. (1) is an easy consequence from the construction of the discs A(a) and A(/3). Namely, if 



A (a, f$) contains two distinct solutions of (2.1 ), then they would differ in at least one coordinate. 



Thus, one of the discs A(ot) or A((3) would contain two roots of or R^ x \ Since both discs are 
isolating for a root of the corresponding resultant polynomial, it follows that A (a, {$) contains 



at most one solution. In the case, where A(a,/3) contains a solution of (2.1), this solution must 



be real since, otherwise, A(a,/3) would also contain a corresponding complex conjugate solution 
(/ and g have real valued coefficients). (2) follows directly from the definition of A(a, the 
definition of L(a), L((3) and Lemma [T] □ 



We denote B(a,f3) = 1(a) x /(/?) C IR 2 a candidate box for a real solution of (2.1), where 
a and (3 are real roots of and R^ , respectively. Due to Theorem [3J the corresponding 
"container polydisc" A(a,(3) C C 2 either contains no solution of (2.1), or (a, 0) is the only 



solution contained in A(a, f3). Hence, for each candidate pair (a, (3) G C, it suffices to show that 



either (a, (3) is no solution of (2.1), or the corresponding polydisc A(a } /3) contains at least one 
solution. In the following steps, we fix the polydiscs A(a,/3), whereas the boxes B(a,(3) are 
further refined (by further refining the isolating intervals 1(a) and I((3)). We further introduce 
exclusion and inclusion predicates such that, for sufficiently small B(a,(3), either (a,/3) can be 



discarded or certified as a solution of (2.1) 



In order to exclude a candidate box, we use simple interval arithmetic. More precisely, we 
evaluate \3f(B(a,(3)) and \3g(B(a, /?)), where □/ and \3g constitute box functions for / and g, 
respectively: If either \3f(B(a,f3)) or \3g(B(a, /?)) does not contain zero, then (a, (3) cannot be 



a solution of (2.1). Vice versa, if (a, (3) is not a solution and B(a,f3) becomes sufficiently small, 
then either ^ \3f(B(a,/3)) or ^ \3g(B(a, (3)), and thus our exclusion predicate applies. 
It remains to provide an inclusion predicate, that is, a method that approves that a certain 



candidate (a } /3) G C is actually a solution of (2.1). We first rewrite the resultant polynomial 

RW as 

R(v)( x ) = u M(x,y) ■ f(x,y)+v^(x,y) ■ g(x,y), 

where 

u (») j v (y) 

G 7j[x,y] are cofactor polynomials which can be expressed as determinants of 
corresponding "Sylvester-like" matrices: 



Ay) Ay) 

Jrriy •>rriy — \,y 



(V) 
<Jn y 



(V) 
s n« - 



e (y) 



„(y) 



e (y) 



(y) 



y{y) 



Ay) Ay) 

J my J m.y — 1 



Jy) 

Uriy 



o /, 



(y) 



Ay) 



(y) 

J<n,y- 





..m^, — 1 



The matrices U {v) and V {y) are obtained from (f, g) by replacing the last column with 
vectors (y n y~ l ...10... 0) T and (0 ... y m y~ x . . . \) T of appropriate size, respectively [32, p. 287]. 
Both matrices have size (n y + m y ) x (n y + m y ) and univariate polynomials in x (the first 
n y + m y — 1 columns), or powers of y (only the last column), or zeros as entries. We now aim 
for upper bounds for \u^\ and \v' v '\ on the polydisc A(a,(3). The polynomials «M and i;W 
have huge coefficients and their computation, either via a signed remainder sequence or via 
determinant evaluation, is very costly. Hence, we directly derive such upper bounds from the 
corresponding matrix representations without computing itW and v^: Due to Hadamard's 
bound, is smaller than the product of the 2-norms of the column vectors of U^ y \ The 

absolute value of each of the entries of t/w can be easily upper bounded by using interval 
arithmetic on a box in C 2 that contains the polydisc A(a, (3). Hence, we get an upper bound on 
the 2— norm of each column vector and, thus, an upper bound U(a, (3,u^) for \u^ y '\ on A(a,/3) 
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by multiplying the bounds for the column vectors. In the same manner, we also derive an upper 
bound U(a,P,vM) for | on A(a,(3). With respect to our second projection direction, we 



write R( x "> 



u 



(x) 



/ + ■ g with corresponding polynomials u 



0) v {x) 



G Zi[x,y]. In exactly 



the same manner as done for R^ y \ we compute corresponding upper bounds U(a, (3, u^) and 
U(a, (3, v for \u^\ and \v^\ on A(a,(3), respectively. 

Theorem 4. If there exists an (xo,yo) G A(a,/3) with 

U(a,{3,uM)-\f(x ,y )\ + U(a,p,vM) 



and 



U(a,f3,uW)-\f{x ,y )\ + U(a,f3,vW) 



(2.2) 
(2.3) 

then A(a,(3) contains a solution of (2.1), and thus f(a,/3) 
Proof. The proof uses a homotopy argument. Namely, we consider the parameterized system 
f(x, y)-(l-t)- f(x , yo) = g(x, y) - (1 - t) • g(x , y ) = 0, (2.4) 
where t is an arbitrary real value in [0, 1] . For t 



\g(x ,y )\ < L(a) 

\g(x ,y )\ < L{p), 
0. 



1, (2.4) is equivalent to our initial system 



(2.1). For t = 0, (2.4) has a solution in A(a,/3), namely, (xo,yo). The complex solutions of (2.4) 
continuously depend on the parameter t. Hence, there exists a "solution path" V : [0, 1] h- > C 2 
which connects T(0) = (xq,|/o) with a solution T(l) G C 2 of (2.1). We show that T(t) does not 



leave the polydisc A(a, 0) and, thus, (2.1) has a solution in A(a, /3): Assume that the path 
r(i) leaves the polydisc, then there exists a t! e [0,1] with (x',y') = T(t') G dA(a,(3). We 
assume that x' G dA(a) (the case y' G dA(/3) is treated in analogous manner). Since (x',y') is 
a solution of ( 2.4[ ) for t = t', we must have \ f(x',y')\ < \f(x , y )\ and \g(x',y')\ < \g(x ,y )\. 
Hence, it follows that 

\R(y\ x ')\ = \uW(x', y ')f(x', y ')+vW(x',y')g(x',y')\ 

< \uW(x',y')\ ■ \f(x',y')\ + \vM(x\y')\ ■ \g(x',y')\ 

< U{a,(3M y) ) ■ \f{x^y )\ + U(a,P,vW) ■ \g(x ,y )\ < L(a). 

This contradicts the fact that \R^ y \x')\ is lower bounded by L(a). It follows that A(a, /3) 
contains a solution of (2.1) and, according to Theorem [3l this solution must be (a,/3). □ 



Theorem |4] now directly applies as an inclusion predicate. Namely, in each refinement step of 
B(a, /3), we choose an arbitrary (xq, yo) G B(a, (3) (e.g. the center (m/( a ), mum) of the candidate 
box B(a,(3)) and check whether both inequalities ( |2.2 ) and (2.3) are fulfilled. If (a, (3) is a 
solution of (2.1), then both inequalities eventually hold and, thus, we have shown that (a,/3) is 
a solution. 

We want to remark that the upper bounds U(a,f3,u^), U(a,P,vM), U(a,(3,u^) and 
U(a, (3, v™') are far from being optimal. Nevertheless, our inclusion predicate is still efficient 
since we can approximate the potential solution (a, (3) with quadratic convergence due to the 
QIR method. Hence, the values f{xQ,yo) and g{xo,yo) become very small after a few iterations. 
In order to improve the above upper bounds, we propose to consider more sophisticated methods 
from numerical analysis and matrix perturbation theory [33| 134"] . Finally, we would like to 
emphasize that our meth od applies particularly well to the situation where we are only interested 
in the solutions of (O) within a given box B = [A, B] x [C,D] C 1R 2 . Though Rto (Rto) 



capture all (real and complex) projections of the solutions of the system, we only have to search 
for the real ones contained within the interval [A, B] ([C,D]). Then, only candidate boxes 
within B have to be considered in Separate and Validate. Hence, since the computation of 



the resultants is relatively cheap due to our fast implementation on the GPU (see Section 5.1) 
our method is particularly well suited to search for local solutions. 
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Figure 3.1: The figure on the left shows a curve C with two x-extremal points and one singular 
point (red dots). In the projection phase, these points are projected onto the x-axis and rational 
points separating the x-critical values are inserted (red crosses). In the lifting phase, the 
fibers at the critical values (red dots) and at the points in between (red crosses) are computed. 
In the connection phase, each pair of points connected by an arc of C is determined, and a 
corresponding line segment is inserted. Finally, we obtain a graph that is isotopic to C. 



3. GeoTop: Analysing an Algebraic Curve 



The input of GeoTop is a planar algebraic curve C as defined in (1.1), where / £ Z,[x,y] is 
a square-free, bivariate polynomial with integer coefficients. If / is considered as polynomial in y 
with coefficients fi(x) £ Z[x], its coefficients typically share a trivial content h := gcd(/o ; fi, ■ ■ •)> 
that is, h £ Z. A non-trivial content h £ Z[x]\Z defines vertical lines at the real roots of h. Our 
algorithm handles this situation by dividing out h first and finally merging the vertical lines 
defined by h = and the analysis of the curve C := V(f/h) at the end of the algorithm; see 
[15] for details. Hence, throughout the following considerations, we can assume that h is trivial, 
thus C contains no vertical line. 

The algorithm returns a planar graph Qc that is isotopic to C, where the set V of all vertices 
of Qc is located on C. From a high-level perspective our algorithm follows a classical cylindrical 
algebraic decomposition approach consisting of three phases that we overview next: 



Overview of the Algorithm. In the first phase (PROJECT, see Section 3.1), we project all 
x-critical points (a, 0) £ C (i.e. f(a, (5) = f y {a, 0) = 0) onto the x-axis by means of a resultant 
computation and root isolation for the elimination polynomial. The set of x-critical points 
comprises exactly the points where C has a vertical tangent or is singular. It is well known 
(e.g. see |15t Theorem 2.2.10] for a short proof) that, for any two consecutive real x-critical 
values a and a', C is delineable over / = (a, a'), that is, C|j x k decomposes into a certain number 
mj of disjoint function graphs CV t i, . . . , Cj jTOj . In the second phase (LIFT, see Section 3.2), 
we first isolate the roots of the (square-free) intermediate polynomial f{qi,y) £ Q[y], where 
qj constitutes an arbitrary chosen but fixed rational value in I. This computation yields the 
number mj (= number of real roots of f(qi, y)) of arcs above / and corresponding representatives 
(<liiyi,i) on each arc. We further compute all points on C that are located above an 

x-critical value a, that is, we determine the real roots y a ,i, ■ ■ ■ , Va,m a of each (non square-free) 
fiber polynomial f(a,y) £ M.[y\. For this task, we propose two different novel methods, and 
we show that both of them can be combined in a way to improve the overall efficiency. From 
the latter computations we obtain the vertex set V of Qc as the union of all points (qi,yi t i) 



and (a, y a ,i)- In the third and final phase (Connect, see Section 3.3), which concludes the 
geometric-topological analysis, we determine which of the above vertices are connected via an 
arc of C. For each connected pair (t>i,t>2) £ V, we insert a line segment connecting v± and v^. 
It is then straight-forward to prove that Qc is isotopic to C; see also [T5| Theorem 6.4.4]. We 
remark that we never consider any kind of coordinate transformation, even in the case where C 
contains two or more x-critical points sharing the same x-coordinate. 
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3.1. Project 

We follow a similar approach as in BlPROJECT, that is, we compute the resultant R(x) : = 
res(/, f y ; y) G Z[x] and a square-free factorization of R. In other words, we first determine square- 
free and pairwise coprime factor^Jrj G Z[x], i — 1, . . . , deg(-R), such that i?(x) = nf=i^ ( r «( a; )) l 5 
and then isolate the real roots ay, j = 1, . . . ,£i, of the polynomials r; which in turn are i- 
fold roots of R. The so-obtained isolating intervals have rational endpoints, and we denote 
I(ati j) C K the interval which contains ay but no other root of r^. Similar as in BlSOLVE, we 
further refine the intervals /(ay), i = 1, . . . , deg(i?) and j = 1, . . . , £i, such that all of them 
are pairwise disjoint. Then, for each pair a and a' of consecutive roots of R defining an open 
interval / = (a, a'), we choose a separating rational value qi in between the corresponding 
isolating intervals. 

3.2. Lift 

Isolating the roots of the intermediate polynomials f(qi,y) is straight-forward because each 
f(qi, y) is a square-free polynomial with rational coefficients, and thus the Descartes method 
directly applies. 

Determining the roots of f a (y) := f(a,y) G at an x-critical value a is considerably 
more complicated because f a has multiple roots and, in general, irrational coefficients. One 
of the main contributions of this paper is to provide novel methods to compute the fiber at 
an x-critical value x = a. More precisely, we first present a complete and certified method 
LlFT-BS which is based on BlSOLVE (taken from Section pp. It applies to any input curve 



(without assuming generic position) and any corresponding x-critical value; see Section 3.2.1 



In Section 3.2.2, we further present a certified symbolic-numeric method denoted LlFT-NT. 
Compared to LlFT-BS, it shows better efficiency in practice, but it may fail for a few fibers if 
the input curve is in a special geometric situation. We further provide a method in order to 
easily check in advance whether LlFT-NT will succeed, and we also prove that this can always 
be achieved by means of a random coordinate transformation. As already mentioned in the 
introduction, we aim to avoid such a transformation for efficiency reasons. Hence, we propose 
to combine both lifting methods in way such that LlFT-NT runs by default, and, only in case 
of its failure, we fall back to LlFT-BS. 

3.2.1. LlFT-BS — a complete method for fiber computation 

LlFT-BS is based on the algorithm BlSOLVE to isolate the real solutions of a system of 
two bivariate polynomials f,g G Z[x,?/]. Recall that BlSOLVE returns a set of disjoint boxes 
Bi,...,B m C M 2 such that each box Bi contains exactly one real solution £ = (xo,yo) °f 
f(x,y) = g(x,y) = 0, and the union of all Bj covers all solutions. Furthermore, for each 
solution £, BlSOLVE provides square-free polynomials p, q G Z[x] with p(xo) = q(yo) = and 
corresponding isolating (and refineable) intervals I(xq) and I(yo) for xq and yo, respectively. 
Comparing £ with another point £ = (xi,yi) G 1R 2 given by a similar representation is rather 
straight-forward. Namely, let p,q G Z[x] be corresponding defining square-free polynomials and 
I(x\) and I(y\) isolating intervals for X\ and yi, respectively, then we can compare the x- and 
^/-coordinates of the points £ and £ via gcd-computation of the defining univariate polynomials 
and sign evaluation at the endpoints of the isolating intervals (see |24[ Algorithm 10.44] for 
more details). 

In order to compute the fiber at a specific real x-critical value a of C, we proceed as 
follows: We first use BlSOLVE to determine all solutions Pi — (a, /3A i — 1, . . . , I, of the system 



Either by square-free factorization, or full factorization 
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/ = fy = with x-coordinate a. Then, for each pi, we compute 



ki := min{k : f yk (a,Pi) = -^(a,^) ^ 0} > 2. 

The latter computation is done by iteratively calling BlSOLVE for f y = f y 2 = 0, f y 2 = f y 3 = 0, 
and so on, and, finally, by restricting and sorting the solutions along the vertical line x = a. 
We eventually obtain disjoint intervals I\, . . . ,Ii and corresponding multiplicities k\, . . . , ki such 
that (3j is a fcj-fold root of f a which is contained in Ij. The intervals Ij already separate the 
roots Pj from any other multiple root of f a , however, Ij might still contain ordinary roots of f a . 

Hence, we further refine each Ij until we can guarantee via interval arithmetic that -^r(a,y) 
does not vanish on Ij. If the latter condition is fulfilled, then Ij cannot contain any root of f a 
except f3 j due to the Mean Value Theorem. Thus, after refining Ij, we can guarantee that Ij is 
isolating. It remains to isolate the ordinary roots of f a : 

We consider the so-called Bitstream Descartes isolator [35j (BDC for short) which constitutes 
a variant of the Descartes method working on polynomials with interval coefficients. This 
method can be used to get arbitrary good approximations of the real roots of a polynomial with 
"bitstream" coefficients, that is, coefficients that can be approximated to arbitrary precision. 
Bdc starts from an interval guaranteed to contain all real roots of a polynomial and proceeds 
with interval subdivisions giving rise to a subdivision tree. Accordingly, the approximation 
precision for the coefficients is increased in each step of the algorithm. Each leaf of the tree 
is associated with an interval / and stores a lower bound 1(1) and an upper bound u(I) for 
the number of real roots of f a within this interval based on Descartes' Rule of Signs. Hence, 
u(I) = implies that I contains no root and thus can be discarded. If 1(1) = u(I) = 1, then 
/ is an isolating interval for a simple root. Intervals with u(I) > 1 are further subdivided. 
We remark that, after a number of iterations, BDC isolates all simple roots of a bitstream 
polynomial, and intervals not containing any root are eventually discarded. For a multiple root 
£, BDC determines an interval / which approximates £ to an arbitrary good precision but never 
certifies such an interval / to be isolating. 

Now, in order to isolate the ordinary roots of f a , we modify BDC in the following way: We 
discard an interval / if one of following three cases applies: i) u(I) = 0, or ii) / is completely 
contained in one of the intervals Ij, or iii) I contains an interval Ij and u(I) < kj. Namely, 
in each of these situations, I cannot contain an ordinary root of f a . An interval I is stored 
as isolating for an ordinary root of f a if /(/) = u(I) = 1, and / intersects no interval Ij. All 
intervals which do not fulfill one of the above conditions are further subdivided. In a last step, 
we sort the intervals Ij (isolating the multiple roots) and the newly obtained isolating intervals 
for the ordinary roots along the vertical line. 

We remark that, in our implementation, BlSOLVE applied in LlFT-BS reuses the resul- 
tant res(/, f y ; y) which has already been computed in the projection phase of the algorithm. 
Furthermore, it is a local approach in the sense that its cost is almost proportional to the 
number of a;-critical fibers that have to be considered. This will turn out to be beneficial in 



the overall approach, where most fibers can successfully be treated by LlFT-NT; see Section 3.2.3 



3.2.2. Lift-NT — a symbolic-numeric approach for fiber computation 

Many of the existing algorithms to isolate the roots of f a (y) = f(a,y) are based on the 
computation of additional (combinatorial) information about f a such as the degree k = k a of 
gcd(/o,, f' a ), or the number m = m a of distinct real roots of f a ; for instance, in [13], the values m 
and k are determined by means of computing a subresultant sequence before using a variant of 
the BDC method (denoted m-fc-Descartes) to eventually isolate the roots of /„. Unfortunately, 
the additional symbolic operations for computing the entire subresultant sequence have turned 
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out to be very costly in practice. The following consideration will show that the number n a 
(= deg(/ a ) — k a ) of distinct complex roots of f a can be computed by means of resultant and 
gcd computations, and a single modular subresultant computation only. In order to do so, we 
first compute an upper bound n+ for each n a , where n+ has the following property: 

If C has no vertical asymptote at x = a, and each critical point (a, 0) (i.e. f x (a, (3) 
= f y (ot, (5) = 0) on the vertical line x = a is also located on C, then n a = ri£- (3-1) 



We will later see that the condition in (3.1) is always fulfilled if C is in a generic location. From 
our experiments, we report that, for almost all considered instances, the condition is fulfilled for 
all fibers. Only for a very few instances, we observed that n a ^ n+ for a small number of fibers. 
In order to check in advance whether n a = n+ for all x-critical values a, we will later introduce 
an additional test that uses a single modular computation and a semi-continuity argument. 

Computation of n+ . The following result due to Teissier [TT1 118] is crucial for our approach: 

Lemma 2 (Teissier). For an x-critical point p = (a, (5) of C , it holds that 

mult(/(a,2/),/3) = Int(/,/ v ,p) - lnt(f x ,f y ,p) + 1, (3.2) 

where mult(/(a, y), (3) denotes the multiplicity of (3 as a root of f(a,y) G lnt(f,f y ,p) 
the intersection multiplicity^ of the curves implicitly defined by f = and f y = at p, and 
Int(/ X , fyiP) the intersection multiplicity of f x = and f y = 0at p. 

Remark 1. In the case, where f x and f y share a common non-trivial factor h = gcd(f x , f y ) G 
7j[x,y}\/L, h does not vanish on any x-critical point p of C , that is, the curves h = and f = 
only intersect at infinity. Namely, h(p) = for some p G C 2 would imply that Int(/ X , f y ,p) = oo 
and, thus, lnt(f,f y ,p) = oo as well, a contradiction to our assumption on f to be square-free. 
Hence, we have lnt(f x ,f y ,p) = lnt(f*,f*,p) with f* := f x /h and f* := f y /h. Hence, the 



following more general formula (which is equivalent to (3.2) for trivial h) applies: 

mult(/(«, y),/3) = lnt(f,f y ,p) - Int(/;, />) + 1. (3.3) 

We now turn to the computation of the upper bound n+. We distinguish the cases deg/ a ^ 
deg^ / and deg/ a = deg y f. In the first case, where C has a vertical asymptote at a, we define 
n+ := deg which is obviously an upper bound for n a . In the case deg = deg v /, the formula 



(3.3) yields: 

n a = ^{distinct complex roots of f a } = deg y f - deggcd(/(a, y), f y (a, y)) 
= deg y f- £(mult(/(a,y),/?)-l) 



pec-. 

/(a,/8) = 



de §,/- E (lnt(/,/„ (a, /3))-Int(/:, /;,(«,/?))) 



(a, /3) is x-critical 

= deg v /-mult(i2,a)+ ^ Int {a, /?)) (3.4) 

(a, /3) is x-critical 

< deg, / - mult( J R, a) + J2 l ^(f^ fy, («, Z 3 )) (3-5) 
= deg, / — mult(.R, a) + mult(Q, a) —'■ (3.6) 

9 The intersection multiplicity of two curves / = and g = at a point p is defined as the dimension of the 
localization of C[x, y]/(f,g) at p, considered as a C-vector space. 
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where R(x) = res(f,f y ;y) and Q(x) := res(/*, /*; y). The equality (3.4) is due to the fact 
that / has no vertical asymptote at a and, thus, the multiplicity mult(i?, a) equals the sum 



^ ( g eC Int((/, / y , (a, /?)) of the intersection multiplicities of / and f y in the fiber at a. (3.6) 
follows by an analogous argument for the intersection multiplicities of /* and /* along the 
vertical line at a. From the square-free factorization of R, the value mult(i?, a) is already 
computed, and mult(Q, at) can be determined, for instance, by computing Q, its square-free 
factorization and checking whether a is a root of one of the factors. The following theorem 
shows that, if the curve C is in generic position, then C has no vertical asymptote or a vertical 
line, and /* and /* do not intersect at any point above a which is not located on CP^l In the 



latter case, the inequality (3.5) becomes an equality, and thus n a = n 



Theorem 5. For a generic s6M (i.e. for all but finitely many), the sheared curve 

C s :={(x,y)ER 2 :f(x + s -y,y) = 0} 
yields n+ = n a for all x-critical values a of C s . 

Proof. For a generic s, the leading coefficient of f(x + sy, y) (considered as a polynomial in y) is 
a constant, hence we can assume that C has no vertical asymptote and contains no vertical line. 
We can further assume that f x and f y do not share a common non-trivial factor h. Otherwise, 
we have to remove h first; see also Remark [T] Let g(x,y) = f(x + sy,y) £ y] denote 
the defining equation of the sheared curve C s , then the critical points of C s are the common 
solutions of 

g x (x,y) = f x (x + sy,y) = and g y (x,y) = f x (x + sy) ■ s + f y (x + sy,y) = 0. 

Hence, the critical points of C s are exactly the points (a',/3') = (a — s/3,/3), where (a, 0) is 
a critical point of C. We now consider a specific (a, 0) and show that, for a generic s, the 
polynomial g(a', y) has either no multiple root or exactly one multiple root at y — — where 
(a', 0) = (a — s0 0) denotes the corresponding critical point of C s . Then, the same holds for 
all critical values (a', 0) in parallel because there are only finitely many critical (a, (3) for C. 
Hence, from the definition of n^,, it then follows that n^, = n a ' for all x-critical values a' of 
C s . W.l.o.g., we can assume that (a, 0) = (0,0), and thus {pc',0) = (0,0) for the corresponding 
critical point of C s . Let y m be the highest power of y that divides g(0,y) = f(sy,y), and define 
f*(s,y) := f{sy,y)/y m . If there exists an so £ R such that f*(so,y) has no multiple root, then 
we are done. Otherwise, for each s, f*(s, y) has a multiple root yo that is different from 0. It 
follows that f*(s,y) is not square-free, that is, there exist polynomials pi,p2 £ C[s, ?/] with 

F(s,y)= f -^^=pi(s,y)-p 2 (s,y) (3.7) 

y 

We remark that, for each s £ C, there exists a y s £ C\{0} such that pi(s,y s ) = 0. Hence, for 
x s := s/y s , we have p\{x s /y s ,y s ) = 0, and thus pi(x/y,y) cannot be a power of y. Now plugging 



s = x/y, with y ^ 0, into (3.7) yields 

f(x,y) =y m -pl(x/y,y)-p 2 (x/y,y) = y m ■ (^^-) ■ = y m ~^~^ .p\{x, y) -p 2 (x, y), 

where pi,p 2 £ C[x, y], and m 1 ,m 2 £ N. Since f(x,y) is square-free, this is only possible if 
pi(x,y) is a power of y. This implies that p\(x/y,y) = pi(x,y)/y mi is also a power of y, a 
contradiction. □ 



10 The reader may notice that generic position is used in a different context here. It is required that all 
intersection points of /* and /* above an a:-critical value a are located on the curve C. 
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We remark that, in the context of computing the topology of a planar algebraic curve, 



Teissier's formula has already been used in [T2| [T6]. There, the authors apply (3.2) in its simpli- 
fied form (i.e. Int(/ X , f y ,p) = 0) to compute mult(/3, f(a,y)) for a non-singular point p = (a, {$). 
In contrast, we use the formula in its general form and sum up the information along the entire 
fiber which eventually leads to the upper bound n+ on the number of distinct complex roots of f a . 

In the next step, we provide a method to check in advance whether the curve C is in a 
generic position in the sense of Theorem [5j Unfortunately, we see no cheap way to check generic 
position with respect to a specific x-critical fiber x = a, that is, whether n+ matches n a for 
a specific a. However, we can derive a global test to decide whether the upper bound n+ 
matches n a for all fibers. While the evaluation of the corresponding test with exact integer 
arithmetic is expensive, we can use the same argument to derive a conservative modular test 
which returns the same answer with very high probability. The test relies on the comparison 
of an upper bound N + for Y2 a n a (i- e - N + > Y2 a n a — ^ := So Ua ) an( ^ a l° wer bound N~ 
for N (i.e. N~ < N = Yl a n a)' wnere we sum over an (complex) x-critical values a. Then, 
N~ = N + implies that n a = n+ for all a. We now turn to the computation of N~ and N + . 
Here, we assume that / has no vertical asymptote and no vertical component (in particular, 
deg^ f(a, y) = deg y f(x, y) =: n y for all values a.). 

Computation o/N + . 

Lemma 3. The sum over all n+, a a complex x-critical value of C , yields: 

(deg x W ■ deg y f) - deg, R + deg, gcd(i?°°, Q), 

where Q = res(/*, /*; y), and gcd(i?°°, Q) is defined as the product of all common factors of R 
and Q with multiplicity according to their occurrence in Q. 

Proof. For the first term, note that deg^, R* is the number of distinct complex x-critical values 
for / and, thus, the number of summands in ^2 a n a . The sum over all multiplicities mult(i?, a) 
for the roots a of R simply yields the degree of R. Finally, the summation over mult (Q, a) 
amounts to removing the factors of Q that do not share a root with R. □ 

We remark that the square-free part R* of the resultant R is already computed in the 
projection phase of the curve analysis, and thus we already know deg^i?*. The additional 
computation of Q and gcd(-R°°, Q) can be performed over a modular prime field Z p for some 
randomly chosen prime p. Then, deg a .(gcd(-R°° mod p, Q mod p)) > deg x gcd(-R°°, Q), and thus 

N + := (deg x R* ■ deg y f) - deg x R + deg^gcd^ 00 mod p, Q mod p)) (3.8) 

constitutes an upper bound for ^2 a n^- We remark that the result obtained by the modular 
computation matches Y2 a n a with very high probability. That is, up to the choice of finitely 
many "unlucky" primes, we have N + = ^a- 

In the next step, we show how to compute a lower bound iV~ for N. In order to understand 
its construction, we first explain how to exactly compute N. We stress that our algorithm never 
performs this computation. 

(Exact) Computation of N. Consider a decomposition of the square- free part R* of the 
resultant R = res(/, f y ; y): 

R* — R X R 2 ■ ■ ■ R s , RieZ[x], (3.9) 
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such that Ri(a) = if and only if f(a, y) has exactly n y — i distinct complex roots. Note that 
all Ri are square- free and pairwise coprime. With di := deg-Rj the degree of the factor Ri, it 
follows that 



N 



Ki<r 



The computation of the decomposition in (3.9) uses subresultants . The i-th subresultant 
polynomial SreSj(/, g; y) G 7L\x,y\ of two bivariate polynomials / and g with y-degrees m y and 
n y , respectively, is defined as the determinant of a Sylvester-like matrix. 



$resi(f,f y ;y) :-- 



(y) 
in,. 



Ay) 

J m v —l 



(y) 



Ay) Ay) 
y n v yn y -i 



Av) 

yn y 



J2i 



(V) 



(+2 v^-'f 



Ay) 

Ji+l 

Ay) 

iJ2i-m y +2 



Ay) 
i/i+i 



y m y -i-l g 



n,, 



m,. 



i rows 



i rows 



The subresultants exhibit a direct relation to the number and multiplicities of common roots 
of / and g. More specifically, it holds that deggcd(/(a, y),g(a, y)) = k if and only if the i-th 
principal subresultant coefficient (psc) srj(x) := sresi(f,g;y) := coeffj(SreSi(/, g; y); y) G 
vanishes at a for all i = 0, . . . , k — 1, and sr^(a) ^ (e.g. see [36| 137] for a proof). 



Thus, the decomposition in (|3.9|) can be derived as 

Sq 



R* 



R l :-- 



Si := gcd(S , i _i,sr i ) 

gcd^o, . . . , Si-i) 



gcd(S ,Si 



Ri ■ 



gcd(S'o, . . . , Si-t, Si 



for i 
for i 



l,...,s, 



l,...,s, 



(3.10) 
(3.11) 



where s is the number of non-trivial entries in the subresultant sequence of / and f y . The 
computation of iV as described here requires the exact computation of all psc's, a very costly 
operation which would affect the overall runtime considerably. Instead, we consider the following 
modular approach: 

Computation of N~ . The main idea of our approach is to perform the above subresultant 
computation over Z p for a single, randomly chosen prime p. More precisely, we denote 



sr (p) ? x \ 



sre Si (p) (/ (p) ^ (p) ;i/) := coeff^Sresf gW; y); y) G Z p [x 



the i-th principle subresultant coefficient in the subresultant sequence of := / mod p G 
7L v \x,y\ and := g mod p G X p [x,y]. The polynomials S^ G 1i p [x] and R^ G are then 

defined in completely analogous manner as the polynomials Si G Z[x] and Ri G Z[x] in (3.10) 
and (3.11), respectively. The following lemma shows that this yields a lower bound for iV if p 
does not divide the leading coefficient of / and f y : 



Lemma 4. Let p be a prime that does not divide the leading coefficient of f and f y , and let 
of := degi?^ denote the degree of . Then, 



n,, 



i) -d, 



(p) 



(3.12) 



constitutes a lower bound for the total number N of distinct points on C in x- critical fibers. 
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Proof. It suffices to show that ^>i * ' di — Si>i * ' ^! i ■ Namely, using d := deg-R* = Yli>i di = 
Yli>i di , we obtain 

N~ = S ^ / (n y — z) • = n y d — i ■ df 1 < n^d — i ■ di = ^^(n y — i) ■ d^ = N. 

i>l i>l i>l »>1 

Since p does not divide the leading coefficient of / and / y , we have sr-^ = sr; modp due to the 
specialization property of subresultants. Hence, nf* := degS^ > i%i := degS; which implies 
the following diagram (with some t such that s <t < n) 

d = 4 P) > > ■ ■ ■ > > > • • • > nj p) = 

II IV IV IV II 

d = tiq > ni > ' ■ • > n s = n s+ i = ■ ■ ■ = n t =0 

Furthermore, we have di = 7ij_i — rti and = nf\ — nf . Thus, 

s s 

i ■ di = = ^^( n j-i ~~ n j) = = n « (since rij = for i > s) 

i>l i>l j>i i>l j>i i>l i>0 

and, analogously, J2i>i i ' <4 P) = Si>o n f ] - Tllis shows X^i>i * • <k < J2i>i i ' <4 P) - □ 

We remark that, for all but finitely many (unlucky) choices of p, all po lynom ials Ri and 
have the same degree. Thus, with high probability, N~ as defined in (3.12) matches N. 
In addition, also with very high probability, we have N + = Yl a n a- Hence, if the curve C is 
in generic position and our choice of p is not unlucky, then N~ = N + = N, and thus we can 
certify in advance that n a = n+ for all x-critical values a. We would like to emphasize that the 
only exact computation (over Z) that is needed for this test is that of the square-free part of the 
resultant R (more precisely, only that of its degree). All other operations can be performed over 
Z p for a single, randomly chosen prime p. Putting everything together now yields our method 
Lift- NT to compute the fiber at an x-critical value: 

LlFT-NT. We consider a hybrid method to isolate all complex roots and, thus, also the real 
roots of f a (y) = f(oz,y) G R[y], where a is a real valued x-critical value of the curve C. It 
combines (a) a numerical solver to compute arbitrary good approximations (i.e. complex discs 
in C) of the roots of f a , (b) an exact certification step to certify the existence of roots within 
the computed discs, and (c) additional knowledge on the number n a of distinct (complex) roots 
of f a . LlFT-NT starts with computing the upper bound n+ for n a and the values N~ and N + 



as defined in (3.6), (3.12), and (3.8), respectively. We distinguish two cases: 



N~ = N + : In this case, we know that n a = nj. We now use a numerical solver to 
determine disjoint discs Di, . . . , D m C C and an exact certification step to certify the 
existence of a certain number m, > 1 of roots (counted with multiplicity) of f a within 



each Df, see Appendix A for details. Increasing the working precision and the number of 
iterations within the numerical solver eventually leads to arbitrary well refined discs Di - 
but without a guarantee that these discs are actually isolating! However, from a certain 
iteration on, the number of discs certified to contain at least one root matches n a . When 
this happens, we know for sure that the D^s are isolating. We can then further refine 
these discs until, for all i = 1, . . . , m, 

D t n R = or A H Dj = for all j ^ i, (3.13) 
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where Di := {z : z G Di} denotes the complex conjugate of Di. The latter condition 
guarantees that each disc Di which intersects the real axis actually isolates a real root of 
f a . In addition, for each real root isolated by some Di, we further obtain its multiplicity 
rrii as a root of f a . 

N~ < N + : In this case, we have either chosen an unlucky prime in some of the modular 



computations, or the curve C is located in a special geometric situation; see (3.1) and 
Theorem [5j However, despite the fact that there might exist a few critical fibers where 
n a < n t > there is still a good chance that equality holds for most a. Hence, we propose to 
use the numerical solver as a filter in a similar manner as in the case, where N~ = N + . 
More precisely, we run the numerical solver on f a for a certain number of iterations p] 
Since n+ constitutes an upper bound on the number of distinct complex roots of f a , we 
must have m < n a < n+ at any time. Hence, if the number m equals n+, we know for 
sure that all complex roots of f a are isolated and can then proceed as above. If, after a 
number of iterations, it still holds that m < LlFT-NT reports a failure. 

Lift-NT is a certified method, that is, in case of success, it returns the mathematical correct 
result. However, in comparison to the complete method LlFT-BS, LlFT-NT may not apply to 
all critical fibers if the curve C is in a special geometric situation. We would like to remark 
that, for computing the topology of the curve C only, we can exclusively use LlFT-NT as the 
lifting method. Namely, when considering, as indicated earlier, an initial shearing x i— >■ x + s ■ y, 
with s a randomly chosen integer, the sheared curve 

C s := {(x,y) eR 2 : f(x + s ■ y,y) = 0} 



is in generic situation (with high probability) due to Theorem [5j Then, up to an unlucky choice 
of prime numbers in the modular computations, we obtain bounds iV~ and iV + for iV which are 
equal. Hence, up to an unlucky choice of finitely many "bad" shearing parameters s and primes 
p, the curve C s is in a generic situation, and, in addition, we can actually prove this. It follows 
that n+ = n a for all x-critical values of the sheared curve C s , and thus LlFT-NT is successful 
for all fibers. Since the sheared curve is isotopic to C, this shows that we can always compute 
the topology of C by exclusively using LlFT-NT during the lifting phase. 

3.2.3. LIFT— Combining LlFT-BS and LlFT-NT 

We have introduced two different methods to compute the fibers at the x-critical values of a 
curve C. LlFT-BS is certified and complete, but turns out to be less efficient than LlFT-NT 
which, in turn, may fail for a few fibers for curves in a special geometric situation. Hence, in 
the lifting step, we propose to combine the two methods. That is, we run LlFT-NT by default, 
and fall back to LlFT-BS only if Lift- NT fails. In practice, as observed in our experiments 



presented in Section 6.2, the failure conditions for LlFT-NT are almost negligible, that is, the 
method only fails for a few critical fibers for some curves in a special geometric situation. In 
addition, in case of a failure, we profit from the fact that our backup method LlFT-BS applies 
very well to a specific fiber. That is, its computational cost is almost proportional to the number 
of fibers that are considered. 

We also remark that, for the modular computations of N~ and N + , we never observed any 
failure when choosing a reasonable large prime. However, it should not be concealed that we only 
performed these computations off-line in Maple. Our C++-implementation still employs a more 
naive approach, where we always use LlFT-NT as a filter as described in the case N~ < iV + 
above. 



11 The threshold for the number of iterations should be chosen based on the degree of / and its coefficient's 
bitlengths. For the instances considered in our experiments, we stop when reaching 2048 bits of precision. 
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a I — (a,a^) a' a 1(a) b I' (a) 



Figure 3.2: The left figure shows the generic case, where exactly one x-critical point (p%) above a 
exists. The bottom-up method connects A\ to pi and A 2 to p 2l the remaining arcs have to pass 
p 3 . In the second figure, the fiber at a contains two critical points p 2 and p%. The red horizontal 
line segments pass through arbitrary chosen points (a,ti) separating and pi. The initial 
isolating interval 1(a) = (a, b) for a is not sufficient to determine the connections for all arcs 
since A\, A 2 , A3 intersect the segments I x {tj}. On the right, the refined isolating interval I 1 (a) 
induces boxes I' (a) x (ti,t i+ i) small enough such that no arc crosses the horizontal boundaries. 
By examination of the ^-coordinates of the intersections between the arcs and the fiber over the 
right-hand boundary of I' (a) (red crosses), we can match arcs and critical points. 

In the last section, we mentioned that LlFT-NT can be turned into a complete method when 
considering an initial coordinate transformation. Hence, one might ask why we do not consider 
such a transformation to compute the topology of C. There are several reasons to not follow 
this approach. Namely, when considering a shearing, the algorithm computes the topology of C, 
but does not directly yield a geometric-topological analysis of the curve since the vertices of 
the so-obtained graph are not located on C. In order to achieve the latter as well, we still have 
to "shear back" the information for the sheared curve, an operation which is non-trivial at all; 
see [13J for details. Even though the latter approach seems manageable for a single curve, it 
considerably complicates the arrangement computation (see Section [4]) because the majority 
of the input curves can be treated in the initial coordinates. Furthermore, in particular for 
sparse input, a coordinate transformation induces considerably higher computational costs in 
all subsequent operations. 



3.3. Connect 

Let us consider a fixed x-critical value a, the corresponding isolating interval 1(a) = (a, b) 
computed in the projection phase and the points p^ := (a,y a ^) G C, i = 1, . . . , m a , located on 
C above a. Furthermore, let I = (a, a') be the interval connecting a with the nearest x-critical 
value to the right of a (or +00 if none exists) and Aj, j — 1, . . . , m/, the j-th arc of C above / 
with respect to vertical ordering. Aj is represented by a point aj := (qi,yi,j) £ C, where yjj 
denotes the j-th real root of f(qi, y) and qj an arbitrary but fixed rational value in /. To its 
left, Aj is either connected to (a, ±00) (in case of a vertical asymptote) or to one of the points 
Pi. In order to determine the point to which an arc Aj is connected, we consider the following 
two distinct cases: 

• The generic case, that is, there exists exactly one real x-critical point pi above a and 
deg f(a,y) = deg y f. The latter condition implies that C has no vertical asymptote at a. 
Then, the points p%, . . . ,Pi -i must be connected with A\, . . . , A io _i in bottom-up fashion, 
respectively, since, for each of these points, there exists a single arc of C passing this point. 
The same argument shows that Pi +i, ■ ■ ■ ,p ma must be connected to A mi - ma +i +i, ■ ■ ■ , A mi 
in top-down fashion, respectively. Finally, the remaining arcs in between must all be 
connected to the x-critical point pi . 
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— • X . — ► x — . >< » » — • * X — • X » X • — > 

Figure 3.3: The two figures on the left show the topology analyses for the curves C = V(f) 
and D = V(g). The second figure from the right shows the intersection of the two curves. For 
the curve pair analysis, critical event lines (at dots) are sorted and non-critical event lines (at 
crosses) in between are inserted. Finally, for each event line x = a, the roots of f(a,y) and 
g(a,y) are sorted. The latter task is done by further refining corresponding isolating intervals 
(blue or red intervals) and using the combinatorial information from the curve analyses and the 
computation of the intersection points. 

• The non-generic case: We choose arbitrary rational values t 1; . . . ,t ma+ i with t% < y a> i < 
t 2 < ■ ■ ■ < y a ,rn a < tm a +i- Then, the points pi := (a, tj) separate the p^s from each other. 
Computing such pi is easy since we have isolating intervals with rational endpoints for 
each of the roots y a ^ of f(a,y). In a second step, we use interval arithmetic to obtain 
intervals ( 3f(I(a) x U) cR with f(I(a) x tj) C *Bf(I(a) x ti). As long as there exists 
an i with £ 5S/(/(a) x ti), we refine 1(a). Since none of the pi is located on C, we 
eventually obtain a sufficiently refined interval 1(a) with ^ 53 /(/(a) x tj) for all i. It 
follows that none of the arcs Aj intersects any line segment 1(a) x tj. Hence, above 1(a), 
each Aj stays within the rectangle bounded by the two segments 1(a) x tj and 1(a) x tj 0+ i 
and is thus connected to pi . In order to determine we compute the j-th real root jj of 
f(b,y) £ Q[y] and the largest %q such that jj > tj . In the special case where jj < ti or 
7j > tj for all i, it follows that Aj is connected to (a, -co) or (a, +oo), respectively. 

For the arcs located to the left of a, we proceed in exactly the same manner. This concludes 
the connection phase and, thus, the description of our algorithm. 




4. Arrangement computation 

CGAL's prevailing implementation for computing arrangements of planar algebraic curves 
reduces all required geometric constructions (as intersections) and predicates (as comparisons of 
points and x-monotone curves) to the geometric-topological analysis of a single curve [13] and 
pairs of curves [1J; see also [38J and CGAL's documentation [2J. 

In Section |3j we have already seen how to improve the curve-analysis. In a similar way, we 
want to increase the performance of the analyses of a pair of curves C = V(f) and D = V(g), 



(see illustration in Figure 3.3). In general, the algorithm from [T] had to compute the entire 
subresultant sequence, an operation that we are aiming to avoid. Using the new analyses of each 
single curve and combining the so-obtained information with the information on the intersection 
points of the two curves C and D as returned by BlSOLVE, it is straight-forward to achieve this 
goal. We mainly have to compute the common intersection points of the two curves: 

Let C = V(f) and D = V(g) be two planar algebraic curves implicitly defined by square-free 
polynomials /, g £ "L[x,y\. The curve analysis for C provides a set of x-critical event lines 
x = a. Each a is represented as the root of a square- free polynomial rj, with a factor of 
Rc '■= res(/, f y ;y), together with an isolating interval 1(a). In addition, we have isolating 
intervals for the roots of f(a,y). A corresponding result also holds for the curve D with 
Rd '■= res(g, g y ;y). For the common intersection points of C and D, a similar representation is 
known. That is, we have critical event lines x = a', where a' is a root of a square-free factor 
of Rcd '■= T es(f,9',y) and, thus, f(a,y) and g(a,y) share at least one common root (or the 
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their leading coefficients both vanish for x = a). In addition, isolating intervals for each of 
these roots have been computed. The curve-pair analysis now essentially follows from merging 
this information. More precisely, we first compute merged critical event lines (via sorting 
the roots of Rq, Rd and Rod) and, then, insert merged non- critical event lines at rational 
values qj in between. The intersections of C and D with a non-critical event line at x — qi are 
easily computed via isolating the roots of f(qi,y) and g(qi,y) and further refining the isolating 
intervals until all isolating intervals are pairwise disjoint. For a critical event line x = a, we 
refine the already computed isolating intervals for f(a,y) and g(a,y) until the number of pairs 
of overlapping intervals matches the number m of intersection points of C and D above a. This 
number is obtained from the output of BlSOLVE applied to / and g, restricted to x = a. The 
information on how to connect the lifted points is provided by the curve analyses for C and D. 
Note that efficiency is achieved by the fact that BlSOLVE constitute (in its expensive parts) a 
local algorithm. 

We remark that, in the previous approach by Eigenwillig and Kerber [T], m is also determined 
via efficient filter methods, while, in general, a subresultant computation is needed if the filters 
fail. This is, for instance, the case when two covertical intersections of C and D occur. For our 
proposed lifting algorithms, such situations are not more difficult, and thus do not particularly 
influence the runtime. 

5. Speedups 

5.1. GPU- accelerated symbolic computations 

As mentioned in the introduction, one of the notable advantages of all our new algorithms 
over similar approaches is that it is not based on sophisticated symbolic computations (such 
as, for example, evaluating signed remainder sequences) restricting the latter ones to only 
computing bivariate resultants and gcds of univariate polynomials. In turn, these operations 
can be outsourced to the graphics hardware to dramatically reduce the overhead of symbolic 



arithmetic. In this section, we overview the proposed GPU 12 algorithms and refer to [TU| |9j [TT] 
for further details. 

At the highest level, the resultant and gcd algorithms are based on a modular or homo- 
morphism approach, first exploited in the works of Brown and Collins [40J. The modular 
approach is a traditional way to avoid computational problems, such as expression swell, shared 
by all symbolic algorithms. In addition, it enables us to distribute the computation of one 
symbolic expression over a large number of processor cores of the graphics card. Our choice of 
the target realization platform is not surprising because, with the released CUDA framework |41j . 
the GPU has become a new standard for high-performance scientific computing. 

To understand the main principles of GPU computing, we first need to have a look at the 
GPU architecture. Observe that the parallelism on the graphics processor is supported on two 
levels. At the upper level, there are numerous thread blocks executing concurrently without any 
synchronization between them. There is a potentially unlimited number of thread blocks that 
can be scheduled for execution on the GPU. These blocks are then processed in a queued fashion 
by the hardware. This realizes block-level parallelism. For its part, each thread block contains a 
limited number of parallel threads (up to 1024 threads on the latest GPUs) which can cooperate 
using on-chip shared memory and synchronize the execution with barriers. This is referred to as 
thread-level parallelism. An important point is that individual threads running on the GPU are 
"lite-weight" in a sense that they do not possess large private memory spaces, neither they can 
execute disjoint code paths without penalties. The conclusion is that an algorithm to be realized 
on the graphics card must exhibit a high homogeneity of computations such that individual 
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threads can perform the same operations but on different data elements. We start our overview 
with the resultant algorithm. 



Computing resultants in Z[x, y] . Given two bivariate polynomials /, <? G Z[x, y], the modular 
resultant algorithm of Collins can be summarized in the following steps: 

(a) apply a modular homomorphism to map the coefficients of / and g to a finite field for 
sufficiently many primes p: "L\x,y\ — >■ Z p [x,y]; 

(b) for each modular image, choose a set of points G Z p , i & I, and evaluate the polynomials 



ell X 



a 



(0 



(evaluation homomorphism): Z p [x, y] — > Z* p [x,y}/(x — a 



(c) compute a set of univariate resultants in Z p [x] in parallel: res y (f, g)\ « : Z m [x,y]/(x — 

4°) -)> Zp[x]/(x - 4°); 

(d) interpolate the resultant polynomial for each prime p in parallel: Z p [x]/(x — a m (i)) — > Z p [x]; 

(e) lift the resultant coefficients by means of Chinese remaindering: Z p [x] — > Z[x]. 

Steps (a)-(d) and partly (e) are outsourced to the graphics processor, thereby minimizing the 
amount of work on the host machine. In essence, what remains to be done on the CPU, is to 
convert the resultant coefficients in the mixed-radix representation (computed by the GPU) to 
the standard form. 

Suppose we have applied modular and evaluation homomorphisms to reduce the resultant 
of / and g to N univariate resultants in Z p [x] for each of M moduli. Thus, provided that 
the modular images can be processed independently, we can launch a grid of N x M thread 
blocks with each block computing the resultant of one modular image. Next, to compute 
the univariate resultants, we employ a matrix-based approach instead of the classical PRS 
(polynomial remainder sequences) used by Collins' algorithm. One of the advantages of this 
approach is that, when a problem is expressed in terms of linear algebra, all data dependencies 
are usually made explicit, thereby enabling thread- level parallelism which is a key factor in 
achieving high performance. 

More precisely, the resultants of the modular images are computed by direct factorization of 
the Sylvester matrix using the so-called Schur algorithm which exploits the special structure of 
the matrix. In order to give an idea how this algorithm works, let /, # G Z[x] be polynomials of 
degrees m and n, respectively. Then, for the associated Sylvester matrix S G Z rxr (r = m + n), 
one can write the following displacement equation |42| : 



S — Z T S[Z m © Z n ) T = GB 1 



(5.1) 



where Z s G Z sxs is a down-shift matrix zeroed everywhere except for l's on the first subdiagonal, 
© denotes the Kronecker sum, and G,B G Z rx2 are the generator matrices whose entries can 



be deduced from S by inspection. For illustration, we can write (5.1) in explicit form setting 
m = 4 and n = 3: 
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The matrix on the right-hand side has rank 2, and hence it can be decomposed as a product of 
r x 2 and 2 x r matrices G and B T . The idea of the Schur algorithm is to rely on this low-rank 
displacement representation of a matrix to compute its factorization in an asymptotically fast 
way. Particularly, to factorize the matrix S, this algorithm only demands for 0(r 2 ) operations 
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in Z; see [421 p. 323]. In short, the Schur algorithm is an iterative procedure: In each step, it 
transforms the matrix generators into a "special form" from which triangular factors can easily 
be deduced based on the displacement equation (5.1). Using division-free modifications, this 
procedure can be performed efficiently in a prime field giving rise to the resultant algorithm 
in 7j p [x}; its pseudocode (serial version) can be found in [HI Section 4.2]. Now, to port this to 
the GPU, we assign one thread to one row of each of the generator matrices, that is, to four 
elements (because G, B G Z rx2 ). In each iteration of the Schur algorithm, each thread updates 
its associated generator rows and multiplies them by a 2 x 2 transformation matrix. Altogether, a 
univariate resultant can be computed in 0(r) finite field operations using r processors (threads). 
This explains the basic routine of the resultant algorithm. 

The next step of the algorithm, namely polynomial interpolation in Z p , can also be performed 
efficiently on the graphics card. Here, we exploit the fact that interpolation is equivalent to 
solving a Vandermonde system, where the Vandermonde matrix has a special structure. Hence, 
we can again employ the Schur algorithm to solve the system in a small parallel time, see [9] 
Section 4.3]. Finally, in order to obtain a solution in 1i[x], we apply the Mixed- Radix Conversion 
(MRC) algorithm |43] which reconstructs the integer coefficients of the resultant in the form of 
mixed-radix (MR) digits. The key feature of this algorithm is that it decouples operations in a 
finite field Z p from those in the integer domain. In addition, the computation of MR digits can 
be arranged in a very structured way allowing for data-level parallelism which can be readily 
exploited to compute the digits on the GPU. 



Computing gcds in Z[x]. The modular gcd algorithm proposed by Brown follows a similar 
outline as Collins' algorithm discussed above. For f,g G Z[sc], it consists of three steps: 

(a) apply modular homomorphism reducing the coefficients of / and g modulo sufficiently many 
primes: Z[x] — > Z* p [x]; 

(b) compute a set of univariate gcds in Z p [x]: gcd(/, g) mod p : Z p [x] — > % p [x]; 

(c) lift the coefficients of a gcd using Chinese remaindering: Z m [x] — > 7*[x}. 

Again, we augment the original Brown's algorithm by replacing the Euclidean scheme (used to 
compute a gcd of each homomorphic image) with a matrix-based approach. The univariate gcd 
computation is based on the following theorem. 

Theorem 6. /44! Let S be the Sylvester matrix for polynomials f,g G ¥[x] with coefficients 
over some field ¥. If S is put in echelon fornf^\ using row transformations only, then the last 
non-zero row gives the coefficients ofgcd(f,g) G ¥[x}. 

Suppose / and g have degrees m and n, respectively. Theorem [6] asserts that if we triangulate 
the Sylvester matrix S G Z rxr (r = n + m), for instance, by means of Gaussian elimination, we 
obtain gcd(/, g) in the last nonzero row of the triangular factor. In order to achieve the latter, 
we apply the Schur algorithm to the positive-definite matrix W = S T S to obtain the orthogonal 
(QR) factorization of In terms of displacements, W can be written as follows [42] : 

W — Z r WZj = GJG T with G G Z rx4 , J = I 2 ®-I 2 . (5.2) 

Here, I s denotes an s x s identity matrix. Remark that it is not necessary to compute the entries 
of W explicitly because the generator matrix G is easily expressible in terms of the coefficients 
of / and g, see [TTj Section 2.2]. Similarly to the resultants, we can run the Schur algorithm 
for W in 0(r) time on the GPU using r processors (threads). That is, one thread is assigned 



13 A matrix is in echelon form if all nonzero rows are above any rows of all zeroes, and the leading coefficient 
of a nonzero row is always strictly to the right of the leading coefficient of the row above it. 
1 The reason why we do not triangularize S directly is elaborated upon in 
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Figure 5.1: (a) Intervals containing the roots of f(a,y) and g(a,y) are refined until they either 
do not overlap or are fully included in candidate boxes. In the former case, the boxes can be 
discarded, (b) Unvalidated candidates are passed to bidirectional filter which runs bitstream 
isolation in another direction 



to process one row of the generator matrix G (4 elements). The source code of a sequential 
algorithm can be found in [TT1 Algorithm 1]. 

From the theoretical perspective, the rest of the GPU algorithm essentially follows the same 
outline as the one for resultants, with the exception that there is no need for an interpolation 
step anymore since the polynomials are univariate. Certainly, there is also a number of practical 
difficulties that need to be clarified. One of them is computing tight upper bounds on the 
height of a polynomial divisor which is needed to estimate the number of moduli used by the 



algorithm ._ The existing theoretical bounds are very pessimistic, and the original algorithm 
by Brown relies on trial division to reduce the number of homomorphic images. However, 
this solution is incompatible with parallel processing because the algorithm must be applied 
incrementally. That is why, in the implementation, we use a number of heuristics to shrink the 
theoretical worst-case bounds. 

Another challenge relates to the fact that it is not always possible to compute the gcd 
of a modular image by a single thread block (recall that the number of threads per block is 
limited) while threads from different blocks cannot work cooperatively. Thus, we needed to 
introduce some "data redundancy" to be able to distribute the computation of a single modular 
gcd (factorization of the Sylvester matrix) across numerous thread blocks. The details can be 
found in the paper cited above. 

5.2. Filters for BlSOLVE 

Besides the parallel computation of resultants and gcds, the algorithm BlSOLVE to solve 
bivariate polynomial systems from Section [2] can be elaborated with a number of filtering 
techniques to early validate a majority of the candidates: 

As first step, we group candidates along the same vertical line (a fiber) at an x-coordinate a 
a root of RM) to process them together. This allows us to use extra information on the real 
roots of f(a,y) € M[y] and g(a,y) G for the validation of candidates. We replace the tests 
based on interval evaluation (see page [9]) by a test based on the bitstream Descartes isolator [35j 



(Bdc) (which has already been used in LlFT-BS; see Section 3.2.1). To do so, we apply BDC 
to both polynomials f(a,y) and g(a,y) in parallel, which eventually reports intervals that do 
not share common roots. This property is essential for our filtered version of VALIDATE: a 
candidate box B (a, (3) can be rejected as soon as the associated ^/-interval I(/3) fully overlaps 
with intervals rejected by Bdc for f(a,y) or g(a,y); see Figure 5.1 (a). 



15 



The height of a polynomial is defined as the maximal magnitude of its coefficients. 
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As alternative we could also deploy the numerical solver that is utilized in Lift-NT; see 



Appendix A for details. Namely, it can be modified in way to report active intervals, and thus 
allows us to discard candidates in non-active intervals. Even more, as the numerical solver 
reports all (complex) solutions, we can use it as inclusion predicate, too: If we see exactly one 
overlap of reported discs Aj and A g (one for f[a,y), the other for g(a,y), respectively), and 
this overlap is completely contained in the projection A(/3) of a candidate polydisc A(a) x A(f3), 
then (a, 0) must be a solution. Namely, f(a,y) and g(a,y) share at least one common root, 
and each of these roots must be contained in Af Pi A g . By construction, A((3) contains at most 
one root, and thus (3 must be the unique common root of the two polynomials. 

Grouping candidates along a fiber x = a also enables us to use combinatorial tests to discard 
or to certify them. First, when the number of certified solutions reaches mult (a), the remaining 
candidates are automatically discarded because each real solution contributes at least once 
to the multiplicity of a as a root of R {y) { see Theorem [l]). Second, if a is not a root of the 
greatest common divisor hS y \x) of the leading coefficients of / and g, mult (a) is odd, and all 
except one candidate along the fiber are discarded, then the remaining candidate must be a real 
solution. This is because complex roots come in conjugate pairs and, thus, do not change the 



parity of mult (a). We remark that, in case where the system (2.1) is in generic position and 
the multiplicities of all roots of R are odd, the combinatorial test already suffices to certify all 
solutions without the need to apply the inclusion predicate based on Theorem [4] 

Now, suppose that, after the combinatorial test, there are several candidates left within a 
fiber. For instance, the latter can indicate the presence of covertical solutions. In this case, 
before using the new inclusion predicate, we can apply the aforementioned filters in horizontal 
direction as well. More precisely, we construct the lists of unvalidated candidates sharing the 
same y-coordinate /3 and process them along a horizontal fiber. For this step, we initialize the 
bitstream trees (or the numerical solvers) for f(x,0) G M\x\ and g(x,/3) G and proceed 



in exactly the same way as done for vertical fibers; see Figure 5.1 (b). We will refer to this 



procedure as the bidirectional filter, especially in Section 6.1 where we examine the efficiency of 
all filters. The (few) candidates that still remain undecided after all filters are applied will be 
processed by considering the new inclusion predicate. 

6. Implementation and experiments 

Setup. We have implemented our algorithms in a branch of the bivariate algebraic kernel 



first released with Cgal 16 version 3.7 in October 2010 |35"| [2]. BlSOLVE is a completely new 
implementation, whereas, for GeoTop and the analyses of pairs, we only replaced the lifting 
algorithms in CGAL's original curve- and curve-pair analyses^ with our new methods based on 
LlFT-NT, LlFT-BSp^l and BlSOLVE. As throughout CGAL, we follow the generic programming 
paradigm which allows us to choose among various number types for polynomials' coefficients 
or intervals' boundaries and to choose the method used to isolate the real roots of univariate 
polynomials. For our setup, we rely on the integer and rational number types provided by 



Gmp 5.0.1] | and the highly efficient univariate solver based on the Descartes method contained 
in Rs 20 (by Fabrice Rouillier [27]), which is also the basis for Isolate in Maple 13 and later 
versions. 



r 



16 The Computational Geometry Algorithms Library, www.cgal.org. 
17 Note that those and our algorithms have Project and Connect in common. 



18 We remark, that the implementation of Lift-BS can be improved: each iteration of BlSOLVE can benefit 
from common factors that occur in the intermediate resultants, that is, for later iterations polynomials with 
smaller degree can be considered. 
19 Gm p: [http : //gmplib . org| 
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Rs: |http : //www . loria . f r/ equipes/vegas/r s 
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All experiments have been conducted on a 2.8 GHz 8- Core Intel Xeon W3530 with 8 MB of 
L2 cache on a Linux platform. For the GPU-part of the algorithm, we have used the GeForce 
GTX580 graphics card (Fermi Core). 



Symbolic Speedups. Our algorithms exclusively rely, as indicated, on two symbolic operations, 
that is, resultant and gcd computation. We outsource both computations to the graphics 
hardware to reduce the overhead of symbolic arithmetic which typically constitutes the main 



bottleneck in previous approaches. Details about this have been covered in Section 5.1 Beyond 
that, it is worth noting that our implementation of univariate gcds on the graphics card is 
comparable in speed with the one from NtlP] running on the host machine. Our explanation 
for this observation is that, in contrast to bivariate resultants, computing a gcd of moderate 
degree univariate polynomials does not provide a sufficient amount of parallelism, and Ntl's 
implementation is nearly optimal. Moreover, the time for the initial modular reduction of 
polynomials, still performed on the CPU, can become noticeably large, thereby neglecting 
the efficiency of the GPU algorithm. Yet, we find it very promising to perform the modular 
reduction directly on the GPU which should further speed-up our algorithm. 



Contestants. For solving bivariate systems (Section 6.1 ), we compared BlSOLVE to the bivariate 
version of Isolate (based on Rs) and Lgp by Xiao-Shan Gao et al, 22 Both are interfaced 
using Maple 14. We remark that, for the important substep of isolating the real roots of the 
elimination polynomial, all three contestants in the ring (including our implementation) use the 
highly efficient implementation provided by Rs. 



When analyzing algebraic curves (Section 6.2) and computing arrangements of algebraic 



curves (Section 6.3), we compared our new implementation with CGAL's bivariate algebraic 
kernel (see [38j and |3H]) that has shown excellent performance in exhaustive experiments over 
existing approaches, namely CAD2Ep^| and ISOTOP [H] which is based on Rs. These two other 
contestants were, except for few example instances, less efficient than CGAL's implementation, 
so that we omit further tests with them. Two further reasons can be given: Firstly, we enhanced 
Cgal's kernel with GPU-supported resultants and gcds which makes it more competitive to 
existing software, but also to GeoTop. Still, slowdowns are observable for singular curves 
or curves in non-generic position due to its need of subresultants sequences performed on the 
CPU. For such hard instances, our new algorithms particularly profit from the algorithmic 
design which avoids costly symbolic operations that can only be performed on the CPU. At 
this point, we also remark that, even if no GPU is available and all symbolic operations would 
be carried out solely on the CPU, GeoTop is still much faster for hard instances. Secondly, 
the contestants based on Rs require as subtask Rs to solve the bivariate polynomial system 
/ = f y = in the curve-analysis. However, our experiments on bivariate system solving that 



we report in Section 6.1 show that BlSOLVE is at least competitive to the current version of 
Rs and even show in most cases an excellent speed gain over Rs. However, it should not be 
concealed that Rs is currently getting a very promising polish which uses the computations of a 
rational univariate representations and modular arithmetic jl6]. Yacine Bouzidi et al. are about 
to submit a bivariate kernel based on the updated Rs to Cgal in the spirit of the existing 
univariate kernel based on RS; see |47| . We are looking forward to compare our analysis and 
the arrangement computation with this upcoming approach. 

All test data sets that we consider in our experiments are available for download 



21 A Library for Doing Number Theory, http://www.shoup.net/ntl/ 

22 Lgp: 



http : //ww w . mmrc . iss . ac . cn/ ~xgao/sof twa re . htmlf " 



23 http : //www . usna . edu/Users/cs/qepcad/B/QEPCAD . html 



http : //www . mpi- inf . mpg . de/departments/dl/pro j ects/Geometry/TCS- SNC . zip 
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6.1. Bivariate system solving 

Our experiments for this task consist of two parts: In the first part, we consider "special" 
curves C = V(f), and compute the x-critical points of C (i.e. the solutions of / = f y = 0). 
The curves are selected in order to challenge different parts of our algorithm (and also other 
algorithms), and in order to show the efficiency of the considered filtering techniques as given 
in Section |5.2 For instance, we considered curves with many singularities or high- curvature 
points which requires many candidates to be tested along each vertical line, or prohibit the 
use of special filters. Table [T] lists timings for various curves (described in Table B.5[ ). In the 
second part of our experiments, we study the performance of BlSOLVE on random polynomials 
with increasing total degrees and coefficient bit-lengths. We refer the reader to Table [2] for the 
corresponding timings. 

In columns 2-6 of Table [T] we see the performance of BlSOLVE with all filters set off 
(BS), with bitstream filter enabled only (BS+BSTR), with bitstream and combinatorial filter 
(BS+BSTR+COMB) and with all filters enabled (BS+ALL); the latter configuration comes 
with and without the computation of symbolic operations on the GPU. For the remaining 
configurations, we only show the timings using the GPU. The corresponding CPU-based timings 
can easily be obtained by adding the (absolute) difference of the BS+ALL-columns. 

One can observe that our algorithm is, in general, superior to ISOLATE and LGP, even if the 
filters are not used. By comparing columns 2-6 of Table [Tj one can see that filtering sometimes 
results in a significant performance improvement. The combinatorial test is particularly useful 
when the defining polynomials of the system (2.1) have large degrees and/or large coefficient bit- 
length while, at the same time, the number of covertical or singular solutions is small compared 
to the total number of candidates being checked. The bidirectional filter is advantageous when 
the system has covertical solutions in one direction (say along y-axis) which are not cohorizontal. 
This is essentially the case for cov_sol_20, swinnerton_dyer, ten_circles and curve_issac. 

Another strength of our approach relates to the fact that the amount of symbolic operations 
is crucially reduced. Hence, when the time for computing resultants is dominating, the GPU- 
based algorithm offers a speed-up by a typical factor of 2-5 (sometimes even more; see, in 
particular, SA_4_4_eps, degree_7_surf, hard_one) over the version with default resultant 
implementation. It is also worth mentioning that both ISOLATE and LGP benefit from the 
fast resultant computation available in Maple while CGAL's default resultant computation^ is 
generally much slower than that of Maple. 

Table [2] lists timings for experiments with random curves. Each instance consists of five 
curves of the same degree (dense or sparse) and we report the total time to compute the solutions 
of five systems of the form / = f y = 0. In order to analyze the influence of the coefficients' 
bit-lengths, we multiplied each curve by 2 k with k G {128, 512, 2048} and increased the constant 
coefficient by one. Since the latter operation constitutes only a small perturbation of the 
vanishing set of the input system, the number of solutions remains constant while the content 
of the polynomials' coefficients also stays trivial. We see that the bidirectional filtering is not of 
any advantage because the system defined by random polynomials is unlikely to have covertical 
solutions. However, in this case, most candidates are rejected by the combinatorial check, 
thereby omitting a (more expensive) test based on Theorem |lj This results in a clear speed-up 
over a "non-filtered" version. Also, observe that, compared to its contestants, GPU-BlSOLVE is 
less vulnerable to increasing the bit-length of coefficients. We have also observed that, for our 
filtered versions, the time for the validation step is almost independent of the bit-lengths. 

Further experiments on solving bivariate systems of interpolated, parameterized, translated 
or projected curves are listed in Appendix C In all these tests BlSOLVE outperforms LGP 
and ISOLATE; the CPU-only version of BlSOLVE is at least as efficient as the contestants, and 



Authors are indebted to Cgal developers working on resultants. 
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(X) special curves (see Table B.5 in Appendix B for descriptions) 





BS 


BS+BSTR 


BS+BSTR+COMB 


BS+all 


BS+ALL 


Isolate 


LGP 


curve 




GPL 




GPU 


CPU 


Maple 


Maple 


13_sings 9 


2.13 


1.84 


1.48 


0.97 


1.65 


341.93 


2.81 


FTT 5 4 4 


48.03 


9.20 


9.00 


20.51 


52.21 


256.37 


195.65 


L4 circles 


0.92 


1.31 


1.62 


0.74 


1.72 


1.31 


7.58 


L6 circles 


3.91 


4.23 


3.68 


2.60 


16.16 


21.37 


51.60 


SA 2 4 eps 


0.97 


0.38 


0.32 


0.44 


4.45 


3.31 


4.69 


SA_4_4_eps 


4.77 


2.07 


1.84 


2.01 


91.90 


158.63 


54.51 


challenge 12 


21.54 


5.33 


5.44 


7.35 


18.90 


44.02 


37.07 


challenge 12 1 


84.63 


12.50 


12.50 


19.17 


72.57 


351.62 


277.68 


compact surf 


12.42 


3.45 


3.29 


4.06 


12.18 


871.95 


12.00 


cov sol 20 


28.18 


24.05 


18.82 


5.77 


16.57 


532.41 


171.62 


curve24 


85.91 


87.92 


13.93 


8.22 


25.36 


86.04 


37.94 


curve issac 


2.39 


2.72 


2.25 


0.88 


1.82 


29.80 


3.29 


cusps and flexes 


1.17 


1.09 


0.86 


0.63 


1.27 


381.51 


2.43 


degree 7 surf 


29.92 


13.14 


11.92 


7.74 


90.50 


timeout 


131.25 


dfold 10 6 


3.30 


2.68 


2.73 


1.55 


17.85 


3.35 


3.76 


grid deg 10 


2.49 


2.37 


1.30 


1.20 


2.49 


111.20 


2.64 


huge cusp 


9.64 


9.81 


6.96 


6.44 


13.67 


timeout 


116.67 


mignotte xy 


t>600 


584.75 


252.94 


243.16 


310.13 


564.05 


timeout 


spider 


167.30 


77.86 


50.61 


46.47 


216.86 


timeout 


timeout 


swinnerton dyer 


28.39 


19.70 


18.92 


5.28 


24.38 


71.14 


27.92 


ten circles 


4.62 


4.19 


4.13 


1.33 


3.74 


5.77 


4.96 



(X) pairs of special curves (see Table B.5 in Appendix B for descriptions) 





BS 


BS + BSTR 


BS+BSTR+COMB 


BS+all 


BS+ALL 


Isolate 


LGP 


pair 




GPU 




GPU 


CPU 


Maple 


Maple 


degl8 7 curves 


2.19 


2.33 


1.74 


0.97 


2.01 


3.50 


4.37 


hard one 


11.34 


10.13 


6.46 


4.29 


82.53 


64.50 


17.45 


large curves 


286.32 


260.35 


72.50 


43.12 


35.37 


311.61 


98.07 


spiral29_24 


207.47 


206.62 


30.35 


18.57 


35.53 


215.35 


76.50 


tryme 


64.77 


65.55 


22.67 


18.61 


48.21 


397.41 


107.80 


vert lines 


0.60 


0.61 


0.63 


0.47 


0.69 


5.79 


1.20 



Table 1: Running times (in seconds, including resultant computations) for solving bivariate 
system defined by special curves. BlSOLVE-GPU: our approach with GPU-resultants; BlSOLVE- 
CPU: our approach with CGAL's CPU-resultants; ISOLATE and LGP use Maple's implementation 
for the resultant computation. Bold face indicates the default setup for BlSOLVE; timeout: 
algorithm timed out (> 600 sec) 
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(R) sets of five ranc 


om dense curves 


















BS 


BS + BSTR 


BS- 


-BSTR+COMB 
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BS+ALL 


Isolate 


LGP 


degree, bits 




GPL 






GPU 


CPU 


Maple 


Maple 


6, 


10 


0.29 


0.33 




0.31 


0.20 


0.38 


0.54 


0.41 


6, 


128 


0.47 


0.29 




0.34 


0.26 


0.31 


0.64 


0.66 


6, 


512 


0.99 


0.69 




0.56 


0.43 


0.54 


1.76 


1.91 


6, 


2048 


5.92 


3.18 




1.99 


1.50 


1.85 


9.31 


9.92 


9, 


10 


2.06 


0.88 




0.74 


0.36 


0.78 


1.24 


0.88 


9, 


128 


3.31 


1.85 




1.04 


0.45 


0.54 


1.50 


1.66 


9, 


512 


7.98 


4.81 




2.39 


0.88 


1.07 


3.62 


4.58 


9, 


2048 


34.12 


19.87 




11.15 


3.75 


4.19 


19.24 


24.66 


12, 


10 


14.85 


4.82 




2.46 


1.07 


2.11 


3.96 


3.32 


12, 


128 


20.08 


7.90 




3.78 


1.32 


1.59 


5.77 


6.39 


12, 


512 


42.73 


18.22 




10.10 


2.45 


2.80 


19.12 


23.17 


12, 


2048 


162.11 


68.28 




53.03 


11.14 


11.97 


109.67 


138.06 


15, 


10 


56.40 


10.64 




5.69 


1.55 


2.66 


6.09 


5.65 


15, 


128 


95.35 


17.00 




10.61 


2.01 


2.30 


8.96 


10.46 


15, 


512 


195.01 


41.42 




31.16 


3.95 


4.22 


26.06 


33.87 


15, 2048 


timeout 


161.00 




169.77 


19.89 


20.45 


140.68 


190.86 


(R) sets of five random sparse curves 






BS 


BS + BSTR 


BS- 


-BSTR+COMB 


BS+all 


BS+ALL 


Isolate 


LGP 


degree 


, bits 




GPL 






GPU 


CPU 


Maple 


Maple 


6, 


10 


0.11 


0.10 




0.16 


0.10 


0.13 


0.19 


0.14 


6, 


128 


0.28 


0.12 




0.14 


0.11 


0.15 


0.23 


0.21 


6, 


512 


0.50 


0.32 




0.24 


0.20 


0.21 


0.48 


0.47 


6, 


2048 


3.32 


1.28 




0.65 


0.58 


0.68 


2.12 


2.15 


9, 


10 


0.20 


0.52 




0.27 


0.18 


0.24 


0.39 


0.31 


9, 


128 


0.45 


0.92 




0.33 


0.22 


0.25 


0.51 


0.52 


9, 


512 


1.21 


1.82 




0.54 


0.37 


0.40 


1.44 


1.49 


9, 


2048 


7.52 


11.02 




1.96 


1.21 


1.38 


7.44 


8.42 


12, 


10 


0.51 


0.72 




0.55 


0.28 


0.38 


0.65 


0.53 


12, 


128 


1.49 


1.61 




0.75 


0.36 


0.36 


1.08 


1.11 


12, 


512 


5.17 


5.75 




1.67 


0.66 


0.69 


3.61 


3.83 


12, 


2048 


47.19 


42.35 




7.98 


2.70 


2.75 


21.25 


23.89 


15, 


10 


3.66 


3.33 




2.11 


1.00 


1.39 


2.48 


2.25 


15, 


128 


12.14 


6.37 




3.35 


1.25 


1.35 


4.17 


4.27 


15, 


512 


43.36 


19.93 




8.52 


2.40 


2.54 


13.95 


15.48 


15, 2048 


408.90 


150.49 




44.34 


10.97 


10.98 


78.65 


89.35 



Table 2: Total running times for solving five systems defined by random curves of increasing 
degree and with increasing bit-lengths. For description of configurations, see Table fTl 
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often even faster. We omit experiments to refine the solution boxes to certain precision as 
this matches the efficiency of Q1R due to the fact that we have algebraic descriptions for the 
solutions' x- and ^/-coordinates. 

6.2. Analysing curves 

We next present the experiments comparing the analyses of single algebraic curves for different 
families of curves: (R) random curves of various degree and bit-lengths of their coefficients, 
(1) curves interpolated through points on a grid, (S) curves in the two-dimensional parameter 
space of a sphere, (T) curves that were constructed by multiplying a curve f(x, y) with f(x,y+l), 
such that each fiber has more than one critical point, (P) projections of intersections of algebraic 
surfaces in 3D and, finally, (G) sets of three generated curves of same degree: (G.l) bivariate 
polynomials with random uniform coefficients (non-singular), (G.2) projected intersection curves 
of a random surface and its z-derivative (singular- f-f z ), and (G.3) projected intersection curves 
of two independently chosen surfaces (singular-/-^) (X) "special" curves of degrees up to 42 with 
many singularities or high- curvature points. The random and special curves were already under 
consideration in Section |6.1| where we only computed their x-critical points. All other curves 



are taken from [15> 4.3]. For the curve topology analysis, we consider five different setups: 

(a) BS+ALL (i.e. BlSOLVE with all filters enabled) which is, strictly speaking, not comparable 
with the curve-analysis as it only computes the solutions of the system / = f y = 0. Still, 
it is interesting to see that, for most instances, GeoTop outperforms BlSOLVE though 
BlSOLVE one only solves a subproblem of the curve- analysis. 

(b) Ak_2 is the bivariate algebraic kernel shipped with Cgal 3.7 but with GPU-supported 
resultants and gcds. 

(c) GeoTop-BS that exclusively uses Lift-BS for the fiber liftings. 

(d) TOP-NT that first applies a random shearing (with a low-bit shearing factor), and, then, 
exclusively uses Lift-NT for lifting step. 

(e) GeoTop combines Lift-NT and Lift-BS in the fiber computations as discussed in 



Section 3.2.3 It uses LlFT-NT first, and if it fails for a certain fiber after a certain number 
of iterations, LlFT-BS is considered for this fiber instead. 

We remark that the global modular filter that checks whether LlFT-NT is successful for all 
fibers, is not yet in action. So far, this test has only been implemented within Maple. As 
expected, it performs very well, that is, the run-times are considerably less than that for 
the majority of steps in the curve analysis. 
GeoTop is our default setting, and its running time also includes the timing for the fiber 
computations where LlFT-NT fails and LlFT-BS is applied instead. 

Table [3] lists the running times for single-curve analyses. We only give the results for 



representative examples; full tables are listed in |Appendix D[ From our experiments, we 



conclude that GeoTop is, in general, superior to the existing kernel, even though CGAL's 
original implementation now profits from GPU-accelerated resultants and gcds. Moreover, while 
the speed-up for curves in generic position is already considerable (about half of the time), it 
becomes even more impressive for projected intersection curves of surfaces and "special" curves 
with many singularities. The reason for this tremendous speed-up is that, for singular curves, 
Ak_2's performance drops significantly with the degree of the curve when the time to compute 
subresultants on the CPU becomes dominating. In addition, for curves in non-generic position, 
the efficiency of Ak_2 is affected because a coordinate transformation has to be considered in 
these cases. 

Recall that Lift-NT in GeoTop fails for very few instances, where Lift-BS is locally used 
to treat some of the x-critical fibers instead. The switch to the backup method is observable in 
timings; see for instance, challenge_12. Namely, the difference of the running times between 
GeoTop and GeoTop-BS are considerably less than the difference which can usually be 
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type, degree, bits 


BS+ALL 


Ak_2 


GeoTop-BS 


Top-NT 


GeoTop 


dense, 09, 10 
dense, 09, 2048 


0.36 
3.75 


0.66 
3.48 


1.50 
10.61 


0.29 
2.03 


0.23 
2.16 


dense, 15, 10 
dense, 15, 2048 


1.55 
19.89 


2.15 
16.86 


5.81 
54.58 


0.96 
7.74 


0.92 
13.24 


sparse, 09, 10 
sparse, 09, 2048 


0.18 
1.21 


1.05 
4.46 


0.54 
2.79 


0.20 
1.38 


0.11 
0.68 


sparse, 15, 10 
sparse, 15, 2048 


1.00 
10.97 


3.37 
22.78 


3.03 
24.85 


0.71 
5.47 


0.59 
5.46 





(R) sets of five random curves 



(I) sets of five interpolated curves through points on a grid 



degree 



BS 



+ ALL 



Ak 2 



GeoTop-BS 



Top-NT 



GeoTop 



12 
15 



3.70 
23.09 
214.54 



4.98 
27.56 
160.36 



9.49 
57.91 
451.29 



1.59 
12.37 
69.20 



2.37 
13.61 
114.63 



(S) sets of five parameterized curves on a sphere with 16bit-coefficients 



degree 



BS 



+ ALL 



Ak 2 



GeoTop-BS 



Top-NT 



GeoTop 



3.00 
30.87 



12.62 
39.74 



16.12 
119.61 



1.97 
27.49 



1.98 
21.37 



(T) sets of five curves with a vertically translated copy 



degree 
6 



BS+ALL 

1.32 
5.05 



Ak_2 
12.69 
134.75 



GeoTop-BS 
8.59 
27.93 



Top-NT 
0.77 
5.39 



GeoTop 
0.67 
2.23 



(P) projected intersection curve of surfaces with 8bit-coefficients 



degree(s) 



BS+ALL 



Ak 2 



GeoTop-BS 



Top-NT 



GeoTop 



6 • 6 



1.40 
21.86 



220.02 
timeout 



383.45 
117.57 



2.57 
19.56 



(G) random singular and non-singular curves 



0.68 
6.17 



type 



degree, bits 



BS 



+ ALL 



Ak 2 



GeoTop-BS 



Top-NT 



GeoTop 



non-singular 42, 237 
singular-/-/, 42, 238 
singular-/-^ 42, 237 



56.57 
64.24 
122.20 



40.66 
timeout 
timeout 



133.12 
372.99 
419.16 



23.27 
52.27 
39.55 



35.80 
25.50 
18.77 



(X) special curves (see Table B.5 in Appendix B for descriptions) 



curve 


BS+ALL 


Ak_2 


GeoTop-BS 


Top-NT 


GeoTop 


L6 circles 


2.60 


171.86 


108.46 


1.61 


1.62 


SA_4_4_eps 


2.01 


122.30 


11.96 


3.92 


2.00 


challenge 12 


7.35 


timeout 


16.11 


64.75 


12.50 


compact surf 


4.06 


81.56 


19.66 


7.43 


5.31 


cov sol 20 


5.77 


43.40 


14.06 


4.22 


2.41 


degree 7 surf 


7.74 


timeout 


57.41 


6.23 


4.19 


dfold_10_6 


1.55 


35.40 


10.74 


8.97 


0.90 


mignotte xy 


243.16 


timeout 


276.89 


199.59 


128.05 


spider 


46.47 


timeout 


200.61 


22.34 


21.03 


swinnerton dyer 


5.28 


347.28 


43.78 


13.04 


6.97 


ten circles 


1.33 


22.77 


11.84 


4.26 


0.86 



Table 3: Running times (in sec) for analyses of algebraic curves of various families; timeout: 
algorithm timed out (> 600 sec) 
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observed for instances where the filter method succeeds for all fibers. In these cases, the 
numerical solver cannot isolate the roots within a given number of iterations, or we indeed 



have n a < n£ for some fibers x = a; see Section 3.2.2 Nevertheless, the running times are 
still very promising and yet perform much better than Ak_2 for non-generic input, even 
though LlFT-BS's implementation is not yet optimized, and we anticipate a further performance 
improvement. 

Similar as Ak_2 has improved on previous approaches when it was presented in 2008, our 
new methods improve on Ak_2 now. That is, for random, interpolated and parameterized 
curves, the speed gain is noticeable, while for translated curves and projected intersections, we 
improve the more the higher the degrees. On some curves of large degree(!), we improve by a 
factor up to 250 and more. 

We also recommend GeoTop over TOP-NT since it gives full geometric information at 
basically no additional cost; that is, for random instances, both are similarly efficient whereas, 
for non-random input, the winner is often determined by the geometry of the curve. For instance, 
the projection step in TOP-NT is faster than that of GeoTop for random and interpolated 
curves. We cannot fully explain this observation, but we guess that the initial shearing results in 
a better separation of the resultant's roots which makes the real root isolation cheaper. On the 
other hand, for curves with many covertical critical points (e.g. challenge_12), shearing yields 
a resultant which decomposes into less but more complex factors, which implies much higher 
cost to isolate the roots of the resultant polynomial. In addition, we have to consider more 
x-critical fibers, and LlFT-NT also has to deal with larger bitlengths. In summary, we propose 
to not consider a shearing because, from our experiments, we can say that the increased cost 
are higher than the cost for the few needed runs of LlFT-BS, when GeoTop analyses the curve 
in the original coordinate system, s Unlike existing algorithms, GeoTop exhibits a very robust 
behavior on singular inputs. In contrast, it often performs even better on singular instances 
than on non-singular curves which have the same input size. This behavior can be read off in 
detail from Table |D.9| in |Appendix D[ where we compare curves of same degree without and 



with singularities. For large instances, GeoTop noticeably outperforms the other contestants 
and actually even benefits from singularities. We suspect that this behavior is due to the fact 
that the resultant splits into many simple factors. Namely, in this case, root isolation of the 
resultant becomes less costly than in the non-singular case, where the resultant does not yield 
such a strong factorization. 

The drastically improved analyses of algebraic curves has also some impact on the performance 
for analyzing algebraic surfaces. The approach in |48j is crucially based on the analysis of the 
projected silhouette curve of the surface f(x, y, z) — (i.e. res(/, f z ; z) — 0). The latter analysis 
turns out to be the main bottleneck using CGAL's algebraic kernel (AK_2; see column 3 in 
Table [3]). In particular, for projected intersection curves of two surfaces, GeoTop behaves 
drastically (typically by a factor 100 and more) better than AK_2. Hence, we claim that the 
maximal reasonable degree of surfaces that can be analyzed using the approach from [38] grows 
from approximately 5 — 6 to 8 — 10. 

6.3. Computing arrangements 

For arrangements of algebraic curves, we compare two implementations: 

(A) Ak_2 is CGAL's bivariate algebraic kernel shipped with CGAL 3.7 but with GPU-supported 
resultants and gcds. 

(B) GeoTopAK_2 is the same but uses GeoTop to analyze single algebraic curves. For the 
curve pair analyses, GeoTopAK_2 exploits Ak_2's functionality whenever subresultant 
computations are not needed (i.e. a unique transversal intersection of two curves along 
a critical event line). For more difficult situations (i.e. two covertical intersections or a 
tangential intersection), the curve pair analysis uses BlSOLVE as explained in Section |4j 
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Our testbed consists of sets of curves from different families: (F) random rational functions of 
various degree (C) random circles (E) random ellipses (R) random curves of various degree and 
coefficient bit-length (P) sets of projected intersection curves of algebraic surfaces, and, finally, 
(X) combinations of "special" curves. 



(P) increasing number of projected surface intersections 



#resulta 



nts 


Ak_2 


GeoTopAK_2 


2 


0.49 


0.21 


3 


0.93 


0.48 


4 


1.64 


1.03 


5 


3.92 


2.44 


6 


7.84 


5.14 


7 


21.70 


13.65 


8 


35.77 


22.69 


9 


67.00 


41.53 


10 


91.84 


58.37 



(X) combinations of special curves 



^curves 


Ak_2 


GeoTopAK_2 


2 


81.93 


9.2 


3 


148.46 


25.18 


4 


730.57 


248.87 


5 


836.43 


323.42 


6 


3030.27 


689.39 


7 


3313.27 


757.94 


8 


timeout 


1129.98 


9 


timeout 


1166.17 


10 


timeout 


1201.34 


11 


timeout 


2696.15 



Table 4: Running times (in sec) for computing arrangements of algebraic curves; timeout: 
algorithm timed out (> 4000 sec) 

We skip the tables for rational functions, circles, ellipses and random curves because the 
performance of both contestants are more or less equal: The linearly many curve-analyses are 
simple and, for the quadratic number of curve-pair analyses, there are typically no multiple 
intersections along a fiber, that is, BlSOLVE is not triggered. Thus, the execution paths of 
both implementations are almost identical, but only as we enhanced Ak_2 with GPU-enabled 
resultants and gcds. In addition, we also do not expect the need of a shear for such curves, thus, 
the behavior is anticipated. The picture changes for projected intersection curves of surfaces 
and combinations of special curves whose running times are reported in Table |4} The Ak_2 
requires for both sets expensive subresultants to analyze single curves and to compute covertical 
intersections, while GeoTopAK_2's performance is crucially less affected in such situations. 

7. Summary and Outlook 

We presented new algorithms to exactly compute with algebraic curves. By combining 
methods from different fields, we have been able to considerably reduce the amount of purely 
symbolic operations, and to outsource the remaining ones to graphics hardware. The majority 
of all computation steps is exclusively based on certified approximate arithmetic. As a result, 
our new algorithms are not only faster than existing methods but also capable to handle 
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geometric difficult instances at least as fast as seemingly easy ones. We believe that, with 
respect to efficiency, there is a good chance that exact and complete methods can compete with 
purely numerical approaches which do not come with any additional guarantee. The presented 
experiments seem to affirm this claim. 

We are confident that our new approach will also have some positive impacts in the following 
respect: There exist several non-certified (or non-complete) approaches either based on subdivi- 
sion [19| l50| EU |52j [53], |M] or homotopy methods [55] . They show very good behavior for most 
inputs. However, in order to guarantee exactness for all possible inputs (e.g. singular curves), 
additional certification steps (e.g. worst case separation bounds for subdivision methods) have to 
be considered, an approach which has not shown to be effective in practice so far. An advantage 
of the latter methods, compared to elimination approaches, is that they are local and do not 
need (global) algebraic operations. It seems reasonable that combining our algorithm with a 
subdivision or homotopy approach eventually leads to a certified and complete method which 
shows excellent "local" behavior as well. 

We further see numerous applications of our methods, in particular, when computing 
arrangements of surfaces. The actual implementation [IS] for surface triangulation is crucially 
based on planar arrangement computations of singular curves. Thus, we are confident that its 
efficiency can be considerably improved by using the new algorithm for planar arrangement 
computation. In addition, it would be interesting to extend our algorithm BlSOLVE to the task 
of solving a polynomial system of higher dimensions. 

The bit complexity analysis of BlSOLVE as presented in [23] hints to the fact that the total 
cost of BlSOLVE is dominated by the root isolation step for the elimination polynomial, and, for 
many instances, our experiments also confirm the latter claim. We aim to provide a proof for 
this behavior by means of a bit complexity analysis for GeoTop as well. 

Finally, we remark that Ak_2 has been integrated into a webdemo [56J which has already 
been used by numerous parties of interest. Certainly, we aim to update this webdemo by 
integrating the new algorithms from GeoTopAK_2 sinstead 
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Appendix A. Numerical Solver with Certificate 



In LlFT-NT (see Section 3.2.2), we deploy a certified numerical solver for a fiber polynomial 
to find regions certified to contain its complex roots. Bini and Fiorentino presented a highly 
efficient solution to this problem in their MPSOLVE package |57j . However, the interface 
of MPSOLVE only allows root isolation for polynomials with arbitrary, but fixed, precision 
coefficients. Our solver adapts their approach in a way suited to also handle the case where the 
coefficients are not known a priori, but rather in an intermediate representation which can be 
evaluated to any arbitrary finite precision. In particular, this applies in the setting of LlFT-NT, 
where the input features algebraic coefficients, represented as refineable isolating intervals of 
integer polynomials. 

The description given in this section is rather high-level, and chosen to cover the specific 
application LlFT-NT. For the details of an efficient implementation, we refer the reader to 
[19j. Let g(z) := f(a, z) = Y^=o9i z% e be a fiber polynomial at an x-critical value a and 
V{g) = id}, i = 1, • • • , n, its complex roots. Thus, g(z) = g n n£=i(z ~ &)■ 

Our numerical solver is based on the Aberth-Ehrlich iteration for simultaneous root finding. 
Starting from arbitrary distinct root guesses (£i)i=i,...,nj it is given by the component-wise 
iteration rule z\ — Zi if g(zi) — 0, and 

g(zi)/g'(zi) 



z' 



g(zi)/g'(zi) ■ 



otherwise. As soon as the approximation vector (zj)i lies in a sufficiently small neighborhood of 
some permutation of the actual roots of g, this iteration converges with cubic order [58] to 
simple roots. For roots of higher multiplicity or clustered roots, we use a variant of Newton's 
method to achieve quadratic convergence as an intermediate step between the Aberth-Ehrlich 
iterations. In practice, this combination shows excellent performance even if started with an 
arbitrary configuration of initial root guesses far away from the solutions. 

A straight-forward implementation of the Aberth-Ehrlich method in arbitrary-precision 
arithmetic requires the coefficients gi of g to be known up to some relative precision p, that is, 
the input is a polynomial g = ^(jiX 1 whose floating point coefficients satisfy \§i — g±\ < 2~ p \gi\. 
In particular, this requirement implies that we have to decide in advance whether a coefficient 
vanishes. However, in our application, a critical x-coordinate a of a fiber polynomial is not 
necessarily rational, and so are the coefficients of g. Thus, the restriction on the coefficients 
translates to expensive symbolic gcd computations of the resultant and the coefficients of the 
defining polynomial / of the curve, considered as a univariate polynomial in Z[y][a;]. 

Instead, we work on a Bitstream interval representation [gY of g (see [35] [19]). Its coefficients 
are interval approximations of the coefficients of g, where we require the width [gf — g^\ of 
each coefficient [gY = [g^,gf] to be < fi for a certain absolute precision \x. Thus, in contrast 
to earlier implementations, we have to decide whether gi = for the leading coefficient only. 
[gY represents the set {g : g~i G [gY} of polynomials in a ^-polynomial neighborhood of g; in 
particular, g itself is contained in [gY- Naturally, for the interval boundaries, we consider dyadic 
floating point numbers (bigfloats) . Note that we can easily compute arbitrarily good Bitstream 
representations of f(a, z) by approximating a to an arbitrary small error, for example using the 
quadratic interval refinement technique [28]. 

Starting with some precision (say, /i = 2 -53 ) and a vector of initial approximations, we 
perform Aberth's iteration on some representative g G [gY- The natural choice is the median 
polynomial with g~i = (g^ + gf)/2, but we take the liberty to select other candidates in case of 
numerical singularities in Aberth's rule (most notably, if g'(zi) = in some iteration). 

After a finite number of iterations (depending on the degree of g), we interrupt the iteration 
and check whether the current approximation state already captures the structure of V(g). 
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We use the following result by Neumaier and Rump [59j, founded in the conceptually similar 
Weierstrafi-Durand-Kerner simultaneous root iteration: 

Lemma 5 (Neumaier). Let g(z) = g n Y\™ =1 {z — Q) G C[z], g n 7^ 0. Let z ; 6 C for i = 1, . . . ,n 
be pairwise distinct root approximations. Then, all roots of g belong to the union T> of the discs 

Di := D(zi - r h \n\), 

u n Ui g(zi) 
where n := — • — and := = -. 

Moreover, every connected component CofT> consisting of m discs contains exactly m zeros of 
g, counted with multiplicity. 

The above lemma applied to [gY using conservative interval arithmetic yields a superset 
C = {Ci, . . . , C m } of regions and corresponding multiplicities Ai, . . . , A m such that, for each 
Ck G C, all polynomials g G [gY (and, in particular, g) have exactly A& roots in counted with 
multiplicities. Furthermore, once the quality of the approximations (2^ and [gY is sufficiently 
high, C converges to V(g). 

In LlFT-NT, where we aim to isolate the roots of g := /(a, y), we check whether m = m a = 
n+. If the latter equality holds, we are guaranteed that the regions Ck G C are isolating for 
the roots of g, and we stop. Otherwise, we repeat Aberth's iteration after checking whether 
G [gY(zi). Informally, if this holds the quality of the root guess is not distinguishable from 
any (possibly better) guess within the current interval approximation of g, and we double the 
precision (// = /^ 2 ) for the next stage. 

Aberth's iteration lacks a proof for convergence in the general case and, thus, cannot 
be considered complete. However, we feel this is a purely theoretical issue: to the best 
of our knowledge, only artificially constructed, highly degenerate configurations of initial 
approximations render the algorithm to fail. In our extensive experiments, this situation never 
occurred. From a theoretical point of view, it is possible to enhance the Aberth-Ehrlich method 
by a complete complex solver as a fallback method to ensure convergence of the root isolation. 
E.g., the CEVAL subdivision solver [601 161] can be extended to handle bitstream coefficients by 
employing perturbation bound techniques [62J. 

We note that regardless of this restriction, the regions Cf. G C are certified to comprise the 
roots of g at any stage of the algorithm by Neumaier's lemma and the rigorous use of interval 
arithmetic. In particular, the correctness of LlFT-NT and, thus, the completeness of the filtered 
curve analysis GeoTop is not affected. 
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Appendix B. Description of Special Curves 



Single curve 


deg„ 


Description 


13 sings 9 


9 


large coefficients; high-curvature points 


FTT_5_4_4* 


40 


many non-rational singularities 


L4 circles 


16 


4 circles w.r.t. L4-norm; clustered solutions 


L6 circles 


32 


4 circles w.r.t. L6-norm; clustered solutions 


SA_2_4_eps* 


16 


singular points with high tangencies, displaced 


SA_4_4_eps* 


33 


singular points with high tangencies, displaced 


challenge 12* 


30 


many candidate solutions to check 


challenge 12 1* 


40 


many candidates to be check 


compact surf 


18 


silhouette of an algebraic surface; many singularities and isolated solutions 


cov sol 20 


20 


covertical solutions 


curve24 


24 


curvature of degree 8 curve; many singularities 


curve issac 


15 


isolated points, high-curvature points [20] 


cusps and flexes 


9 


high-curvature points 


degree 7 surf 


42 


silhouette of an algebraic surface; covertical solutions in x and y 


dfold_10_6* 


30 


many half-branches 


grid deg 10 


10 


large coefficients; curve in generic position 


huge cusp 


8 


large coefficients; high-curvature points 


mignotte xy 


42 


a product of x/y- Mignotte polynomials, displaced; many clustered solutions 


spider 


12 


degenerate curve; many clustered solutions 


swinnerton dyer 


25 


covertical solutions in x and y 


ten circles 


20 


set of 10 random circles multiplied together; rational solutions 


Pairs of curves 


deg„ 


Description 


degl8 7 curves 


18, 7 


higher-order singularities on both curves 


hard one 


27, 6 


vertical lines as components of one curve; many candidates to check 


large curves 


24, 19 


large number of solutions 


spiral29_24 


29, 24 


Taylor expansion of a spiral intersecting a curve with many branches; 
many candidates to check 


tryme 


24, 34 


covertical solutions; many candidates to check 


vert lines 


16, 6 


high-order singularity on one curve, many intersections 



Table B.5: Description of the curves used in the first part of experiments. In case only a single 
curve given, the second curve is taken to be the first derivative w.r.t. y-variable. Curves marked 
with a star (*) are given in [63] - 
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Appendix C. Further experiments for bivariate system solving 



(1) sets of five interpolated curves through points on a grid 




BS 


BS + BSTR 


BS+BSTR+COMB 


BS+all 


BS+ALL 


Isolate 


LGP 


degree 




GPU 


GPU 


CPU 


Maple 


Maple 


5 


0.29 


0.17 


0.32 


0.27 


0.38 


0.59 


0.51 


6 


1.20 


0.50 


0.67 


0.59 


0.71 


1.07 


1.12 


7 


4.52 


1.79 


1.35 


1.16 


1.37 


2.08 


2.32 


8 


14.86 


3.63 


2.55 


1.98 


2.51 


3.82 


4.20 


9 


63.46 


7.33 


5.19 


3.70 


4.50 


7.17 


7.99 


10 


194.04 


13.14 


8.96 


5.46 


6.71 


12.44 


13.76 


11 


timeout 


25.11 


19.59 


10.94 


12.31 


24.82 


28.61 


12 


timeout 


44.84 


41.88 


23.09 


25.23 


50.54 


55.56 


13 


timeout 


80.44 


84.29 


45.54 


49.92 


98.92 


110.02 


14 


timeout 


138.13 


191.25 


101.96 


103.91 


182.72 


205.26 


15 


timeout 


225.39 


376.17 


214.54 


219.39 


371.25 


399.64 


16 


timeout 


367.85 


timeout 


410.46 


427.50 


timeout 


timeout 


/Si cptc nf fi\/p nara 

1 Jl 3C L3 \J 1 MVC paid 


meterized curves on a sphere with 16bit-coefFicients 










BS 


BS + BSTR 


BS+BSTR+COMB 


BS+all 


BS+ALL 


Isolate 


LGP 


degree 




GPU 


GPU 


CPU 


Maple 


Maple 


1 


0.06 


0.05 


0.1 


0.09 


0.12 


0.14 


0.13 


2 


0.23 


0.48 


0.24 


0.21 


0.36 


0.47 


0.40 


3 


3.28 


1.94 


0.53 


0.39 


0.66 


0.92 


0.87 


4 


26.62 


9.21 


1.38 


1.03 


2.07 


2.81 


2.65 


5 


241.74 


23.74 


3.22 


1.93 


4.24 


6.92 


6.05 


6 


timeout 


65.23 


6.26 


3.00 


6.21 


10.81 


10.01 


7 


timeout 


136.56 


19.81 


11.52 


21.33 


52.11 


50.37 


8 


timeout 


221.74 


38.8 


22.52 


35.77 


107.27 


107.84 


9 


timeout 


569.67 


66.19 


30.87 


50.00 


170.10 


169.87 


10 


timeout 


timeout 


117.21 


46.32 


69.99 


280.90 


277.94 


(T) sets of five curves with a vertically translated copy 




BS 


BS + BSTR 


BS+BSTR+COMB 


BS+all 


BS+ALL 


Isolate 


LGP 


degree 




GPU 


GPU 


CPU 


Maple 


Maple 


5 


23.29 


1.38 


1.8 


0.93 


2.07 


2.02 


1.68 


6 


123.54 


3.31 


3.5 


1.32 


2.89 


3.17 


2.64 


7 


506.96 


7.73 


6.62 


2.15 


4.22 


4.43 


4.18 


8 


timeout 


13.32 


12.66 


2.84 


5.68 


6.42 


6.47 


9 


timeout 


25.95 


22.4 


5.05 


10.28 


11.09 


12.15 


10 


timeout 


41.67 


38.12 


5.19 


10.77 


12.28 


13.40 


(P) projected intersection curve of surfaces with 8bit-coefficients 




BS 


BS + BSTR 


BS+BSTR+COMB 


BS+all 


BS+ALL 


Isolate 


LGP 


degrees 




GPU 


GPU 


CPU 


Maple 


Maple 


3 • 3 


0.10 


0.11 


0.11 


0.08 


0.14 


0.18 


0.14 


4-4 


0.72 


0.46 


0.21 


0.07 


0.15 


0.18 


0.16 


5 • 5 


98.16 


27.09 


1.92 


1.00 


2.36 


3.25 


3.19 


6-6 


timeout 


48.52 


9.98 


1.40 


2.50 


3.17 


3.60 


7-7 


timeout 


timeout 


94.75 


19.90 


27.73 


29.38 


29.53 


8 • 8 


timeout 


timeout 


377.85 


21.86 


32.75 


46.02 


74.17 



Table C.6: Running times (in sec) for solving families of bivariate systems / = f y — 0; timeout: 
algorithm timed out (> 600 sec) 
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Appendix D. Further experiments for analysing curves 



(R) sets of five random dense curves 


degree, bits 


BS+ALL 


Ak_2 


GeoTop-BS 


Top-NT 


GeoTop 


06, 10 


0.20 


0.37 


0.71 


0.07 


0.14 


06, 128 


0.26 


0.35 


0.62 


0.10 


0.15 


06, 512 


0.43 


0.56 


1.15 


0.17 


0.29 


06, 2048 


1.50 


1.74 


4.25 


0.47 


0.98 


09, 10 


0.36 


0.66 


1.50 


0.29 


0.23 


09, 128 


0.45 


0.58 


1.21 


0.23 


0.29 


09, 512 


0.88 


1.00 


2.38 


0.60 


0.57 


09, 2048 


3.75 


3.48 


10.61 


2.03 


2.16 


12, 10 


1.07 


1.74 


4.54 


0.62 


0.65 


12, 128 


1.32 


1.45 


3.51 


0.66 


0.82 


12, 512 


2.45 


2.52 


7.37 


1.13 


1.49 


12, 2048 


11.14 


10.01 


33.72 


3.83 


6.95 


15, 10 


1.55 


2.15 


5.81 


0.96 


0.92 


15, 128 


2.01 


1.94 


4.92 


1.27 


1.20 


15, 512 


3.95 


3.53 


11.16 


1.91 


2.46 


15, 2048 


19.89 


16.86 


54.58 


7.74 


13.24 


(R) sets of five random sparse curves 


degree, bits 


BS+ALL 


Ak_2 


GeoTop-BS 


Top-NT 


GeoTop 


06, 10 


0.10 


0.22 


0.25 


0.06 


0.07 


06, 128 


0.11 


0.23 


0.26 


0.08 


0.08 


06, 512 


0.20 


0.34 


0.42 


0.12 


0.13 


06, 2048 


0.58 


1.07 


1.39 


0.42 


0.36 


09, 10 


0.18 


1.05 


0.54 


0.20 


0.11 


09, 128 


0.22 


1.00 


0.48 


0.27 


0.13 


09, 512 


0.37 


1.30 


0.78 


0.39 


0.20 


09, 2048 


1.21 


4.46 


2.79 


1.38 


0.68 


12, 10 


0.28 


1.62 


0.88 


0.21 


0.17 


12, 128 


0.36 


1.62 


0.93 


0.25 


0.22 


12, 512 


0.66 


2.45 


1.73 


0.47 


0.42 


12, 2048 


2.70 


8.49 


7.23 


1.89 


1.94 


15, 10 


1.00 


3.37 


3.03 


0.71 


0.59 


15, 128 


1.25 


3.87 


3.10 


0.99 


0.63 


15, 512 


2.40 


5.65 


5.88 


1.59 


1.22 


15, 2048 


10.97 


22.78 


24.85 


5.47 


5.46 



Table D.7: Running times (in sec) for analyses of random algebraic curves 
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(1) sets of five interpolated curves through points on 


■ 1 

a grid 






degree 


BS+ALL 


Ak_2 


GeoTop-BS 


Top-NT 


GeoTop 


5 


0.27 


0.51 


0.79 


0.18 


0.20 


6 


0.59 


0.87 


1.53 


0.31 


0.37 


7 


1.16 


1.69 


2.98 


0.49 


0.73 


8 


1.98 


2.88 


5.39 


1.09 


1.19 


9 


3.70 


4.98 


9.49 


1.59 


2.37 


10 


5.46 


7.62 


15.89 


3.36 


3.22 


11 


10.94 


13.52 


28.99 


5.51 


6.57 


12 


23.09 


27.56 


57.91 


12.37 


13.61 


13 


45.54 


46.90 


113.87 


18.20 


26.26 


14 


101.96 


88.76 


219.89 


43.99 


56.47 


15 


214.54 


160.36 


451.29 


69.20 


114.63 


16 


410.46 


312.27 


timeout 


69.65 


236.39 


(S) sets of five parameterized curves 


on a sphere with 16bit-coefficients 




degree 


BS+ALL 


Ak_2 


GeoTop-BS 


Top-NT 


GeoTop 


1 


0.09 


0.11 


0.18 


0.03 


0.07 


2 


0.21 


0.34 


0.68 


0.08 


0.17 


3 


0.39 


0.70 


1.51 


0.29 


0.26 


4 


1.03 


2.43 


4.73 


0.59 


0.71 


5 


1.93 


5.99 


10.17 


0.98 


1.33 


6 


3.00 


12.62 


16.12 


1.97 


1.98 


7 


11.52 


16.35 


49.50 


12.95 


7.42 


8 


22.52 


28.28 


84.85 


22.87 


14.04 


9 


30.87 


39.74 


119.61 


27.49 


21.37 


10 


46.32 


53.28 


154.56 


27.91 


28.16 


(T) sets of five curves 


with a vertically translated copy 






degree 


BS+ALL 


Ak_2 


GeoTop-BS 


Top-NT 


GeoTop 


5 


0.93 


5.72 


5.85 


0.55 


0.53 


6 


1.32 


12.69 


8.59 


0.77 


0.67 


7 


2.15 


29.40 


13.27 


1.22 


1.07 


8 


2.84 


66.13 


16.74 


2.03 


1.27 


9 


5.05 


134.75 


27.93 


5.39 


2.23 


10 


5.19 


286.69 


29.27 


5.71 


2.30 


(P) projected intersection curve of su 


rfaces with 8bit-coefficients 






degree(s) 


BS+ALL 


Ak_2 


GeoTop-BS 


Top-NT 


GeoTop 


3-3 


0.08 


0.15 


0.36 


0.05 


0.06 


4-4 


0.21 


0.67 


1.81 


0.35 


0.12 


5 • 5 


1.00 


3.94 


6.87 


1.33 


0.55 


6 • 6 


1.40 


220.02 


383.45 


2.57 


0.68 


7-7 


19.90 


timeout 


84.74 


7.11 


3.70 


8-8 


21.86 


timeout 


117.57 


19.56 


6.17 



Table D.8: Running times (in sec) for analyses of algebraic curves of various families; timeout: 
algorithm timed out (> 600 sec) 
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(Gj random singular and 


non- 


singular curves 










type degree, bits 


BS4 


-ALL 


Ak_ 


2 


GeoTop-BS 


Top-NT 


GeoTop 


non-singular 20, 160 


2.76 


2.15 


6.47 


0.84 


1.27 


singular-/-/, 20, 161 


4.82 


109.31 


16.59 


1.34 


1.43 


singular-/-^ 20, 160 


4.56 


115.96 


16.17 


2.36 


1.11 


non-singular 30, 199 


19.26 


12.51 


45.09 


5.08 


9.30 


singular- f-f z 30, 201 


20.34 


timeout 


60.45 


9.39 


5.32 


singular- /-# 30, 198 


29.89 


timeout 


90.79 


12.22 


5.38 


non-singular 42, 237 


56.57 


40.66 


133.12 


23.27 


35.80 


singular-/-/, 42, 238 


64.24 


timeout 


372.99 


52.27 


25.50 


singular-/-;? 42, 237 


122.20 


timeout 


419.16 


39.55 


18.77 


non-singular 56, 284 


367.99 


161.68 


timeout 


timeout 


1 OQ 053 

izy.oo 


singular-/-/, 56, 290 


214.05 


timeout 


timeout 


187.64 


121.79 


singular- /-# 56, 280 


timeout 


timeout 


timeout 


136.64 


77.53 




















(X) special curves (see Table 


B.5 


in 


Appendix B 


for descriptions) 
























curve 




-ALL 


Ak_ 


2 


GeoTop-BS 


Top-NT 


GeoTop 


13 sings 9 


0.97 


2.66 


3.74 


0.22 


0.61 


FTT_5_4_4 


20.51 


timeout 


32.07 


95.03 


27.81 


L4 circles 


0.74 


6.63 


12.41 


0.64 


0.45 


L6 circles 


2.60 


171.86 


108.46 


1.61 


1.62 


SA_2_4_eps 


0.44 


53.96 


2.35 


1.17 


0.29 


SA_4_4_eps 


2.01 


122.30 


11.96 


3.92 


2.00 


challenge 12 


7.35 


timeout 


16.11 


64.75 


12.50 


challenge 12 1 


19.17 


timeout 


48.95 


185.55 


35.65 


compact surf 


4.06 


81.56 


19.66 


7.43 


5.31 


cov sol 20 


5.77 


43.40 


14.06 


4.22 


2.41 


curve24 


I 


5.22 


38.22 


27.58 


8.36 


3.54 


curve issac 


0.8? 




2.63 


5.46 


0.33 


0.37 


cusps and flexes 


0.63 


2.09 


2.97 


0.57 


0.44 


degree 7 surf 


7.74 


timeout 


57.41 


6.23 


4.19 


dfold_10_6 


1.55 


35.40 


10.74 


8.97 


0.90 


grid deg 10 


1.20 


1.55 


3.19 


1.18 


0.73 


huge cusp 


6.44 


17.8 


8 


19.09 


3.34 


4.82 


mignotte xy 


243.16 


timeout 


276.89 


199.59 


128.05 


spider 


46.47 


timeout 


200.61 


22.34 


21.03 


swinnerton dyer 


5.28 


347.28 


43.78 


13.04 


6.97 


ten circles 


1.33 


22.77 


11.84 


4.26 


0.86 



Table D.9: Running times (in sec) for analyses of generated and special algebraic curves; 
timeout: algorithm timed out (> 600 sec) 
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